[57] ABSTRACT

A method for detecting memory access errors which occur 
while executing a computer program. Spatial and temporal 
attributes are provided for a data object and these attributes 
are associated with each pointer to that data object. On a 
dereference to a pointer, a memory access check is per- 
formed which determines (a) whether the dereference falls 
outside the address range within which valid accesses may 
be made to the data object, and (b) whether the dereference 
falls outside the time period within which valid accesses 
may be made to the data object. If the dereference falls
outside the valid address range, a spatial error is flagged. If 
the dereference falls outside the valid time period, a tem- 
poral error is flagged. In addition, a method is described for 
converting a preexisting source-language program file into a 
safe program and a method is described for optimizing 
memory-access checks. 
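
The check described in the abstract can be sketched in C as follows. This is a minimal illustration only, not the patented implementation: the names SafePtr, safe_alloc, safe_free, check_access and the fixed-size live[] table are hypothetical, and the capability store is assumed to be a simple array indexed by capability number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* All names below are hypothetical; this is a sketch, not the patent's layout. */
enum { MAX_CAPS = 64 };
static int live[MAX_CAPS];      /* live[c] != 0 while capability c is valid */
static unsigned next_cap = 1;   /* capability 0 is reserved as "never valid" */

typedef struct {
    char    *value;       /* address currently held by the pointer */
    char    *base;        /* start of the referent (spatial attribute) */
    size_t   size;        /* size of the referent in bytes (spatial attribute) */
    unsigned capability;  /* lifetime tag of the referent (temporal attribute) */
} SafePtr;

typedef enum { ACCESS_OK, SPATIAL_ERROR, TEMPORAL_ERROR } AccessResult;

SafePtr safe_alloc(size_t size) {
    SafePtr p;
    if (next_cap >= MAX_CAPS) abort();  /* sketch: capability store is fixed-size */
    p.base = p.value = malloc(size);
    if (p.base == NULL) abort();
    p.size = size;
    p.capability = next_cap++;
    live[p.capability] = 1;
    return p;
}

void safe_free(SafePtr p) {
    live[p.capability] = 0;   /* invalidates every copy of the pointer at once */
    free(p.base);
}

/* Check an access of `width` bytes through p without performing it. */
AccessResult check_access(SafePtr p, size_t width) {
    uintptr_t offset;
    if (!live[p.capability])
        return TEMPORAL_ERROR;            /* outside the referent's lifetime */
    offset = (uintptr_t)p.value - (uintptr_t)p.base;
    if (width > p.size || offset > p.size - width)
        return SPATIAL_ERROR;             /* outside the referent's bounds */
    return ACCESS_OK;
}
```

Because the attributes travel with each copy of the pointer, a copy made before the referent is freed still fails the temporal check afterwards, and a copy advanced to one past the end of the referent fails the spatial check.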
void *malloc(unsigned size) {
    void *p;

    p.base = p.value = unsafe_malloc(size);
    p.size = size;
    p.storageClass = Heap;
    p.capability = NextCapability();
    InsertCapability(p.capability);
    bzero(p.value, size); /* also make capability NEVER = 0 */
    return p;
}

void *calloc(unsigned nelem, unsigned elsize) {
    return malloc(nelem*elsize);
}

void *realloc(void *p, unsigned size) {
    void *new;

    new = malloc(size);
    bcopy(p.base, new.base, min(size, p.size));
    free(p);
    return new;
}

void free(void *p) {
    if (p.storageClass != Heap)
        FlagNonHeapFree();
    if (!ValidCapability(p.capability))
        FlagDuplicateFree();
    if (p.value != p.base)
        FlagNonOriginalFree();
    DestroyCapability(p.capability);
    unsafe_free(p.value);
}

FIG. 11.
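
The three errant-free checks of FIG. 11 can be exercised with ordinary C if the safe pointer is written out as an explicit structure. The following is a sketch under stated assumptions: SafePtr, FreeResult, checked_malloc, checked_free and the live[] table are hypothetical names, and, unlike the figure (which flags an error and continues), this version returns a result code and refuses the errant free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names; a sketch of the checks in FIG. 11, not the figure itself. */
typedef enum { Global, Stack, Heap } StorageClass;
typedef enum { FREE_OK, NON_HEAP_FREE, DUPLICATE_FREE, NON_ORIGINAL_FREE } FreeResult;

enum { MAX_CAPS = 64 };
static int live[MAX_CAPS];      /* live[c] != 0 while capability c is valid */
static unsigned next_cap = 1;   /* capability 0 is never valid */

typedef struct {
    char        *value;         /* current pointer value */
    char        *base;          /* start of the allocation */
    size_t       size;          /* size of the allocation in bytes */
    StorageClass storageClass;  /* where the referent lives */
    unsigned     capability;    /* lifetime tag */
} SafePtr;

SafePtr checked_malloc(size_t size) {
    SafePtr p;
    if (next_cap >= MAX_CAPS) abort();   /* sketch: fixed-size capability store */
    p.base = p.value = calloc(1, size);  /* zeroed, as bzero() does in FIG. 11 */
    if (p.base == NULL) abort();
    p.size = size;
    p.storageClass = Heap;
    p.capability = next_cap++;
    live[p.capability] = 1;
    return p;
}

FreeResult checked_free(SafePtr p) {
    if (p.storageClass != Heap)
        return NON_HEAP_FREE;        /* freeing global or stack storage */
    if (!live[p.capability])
        return DUPLICATE_FREE;       /* storage was already freed */
    if (p.value != p.base)
        return NON_ORIGINAL_FREE;    /* interior pointer, not the allocation start */
    live[p.capability] = 0;          /* destroy the capability */
    free(p.base);
    return FREE_OK;
}
```

The order of the checks matters in this sketch: the capability is tested before the pointer value is compared, so a stale pointer is rejected without its address being examined.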
void Func(int a) {
    /* procedure prologue */
    unsigned frameCapability = NextCapability();
    InsertCapability(frameCapability);
    ZeroFramePointers();
    /* also make capability NEVER = 0 */

    /* procedure epilogue, common function exit point */
    DestroyCapability(frameCapability);
    return;
}

FIG. 12.
void ValidateAccess(<type> *addr) {
    if (freeCount != currentFreeCount) {
        if ((storageClass != Global) &&
            (!ValidCapability(capability)))
            FlagTemporalError();
        freeCount = currentFreeCount;
    }
    if (lastDerefAddr != addr) {
        if (((unsigned)addr - (unsigned)base) >
            (size - sizeof(<type>)))
            FlagSpatialError();
        lastDerefAddr = addr;
    }
    /* valid access! */
}

FIG. 15.
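
The optimization in FIG. 15 skips the temporal check while no deallocation has occurred since the pointer was last validated, and skips the spatial check while the address is unchanged. A runnable sketch of that idea follows; CheckedPtr, validate, note_free, temporalChecksRun and the live[] table are hypothetical names introduced for illustration, and the counter exists only so the skipping can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; a sketch of the skipping logic, not the figure itself. */
static unsigned currentFreeCount;     /* incremented by every deallocation */
static unsigned temporalChecksRun;    /* instrumentation for this sketch only */
static int live[8] = { 0, 1, 1 };     /* capabilities 1 and 2 start out valid */

typedef struct {
    uintptr_t base;           /* start of the referent */
    size_t    size;           /* size of the referent in bytes */
    unsigned  capability;     /* lifetime tag */
    unsigned  freeCount;      /* value of currentFreeCount at last temporal check */
    uintptr_t lastDerefAddr;  /* last address that passed the spatial check */
} CheckedPtr;

typedef enum { ACCESS_OK, SPATIAL_ERROR, TEMPORAL_ERROR } AccessResult;

void note_free(unsigned capability) {
    live[capability] = 0;
    currentFreeCount++;       /* forces a temporal re-check on the next access */
}

AccessResult validate(CheckedPtr *p, uintptr_t addr, size_t width) {
    if (p->freeCount != currentFreeCount) {   /* something was freed since last check */
        temporalChecksRun++;
        if (!live[p->capability])
            return TEMPORAL_ERROR;
        p->freeCount = currentFreeCount;
    }
    if (p->lastDerefAddr != addr) {           /* address changed since last check */
        if (width > p->size || addr - p->base > p->size - width)
            return SPATIAL_ERROR;
        p->lastDerefAddr = addr;
    }
    return ACCESS_OK;
}
```

The unsigned subtraction addr - p->base wraps to a large value when addr lies below the base, so a single comparison rejects accesses on either side of the referent.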
METHOD FOR DETECTING COMPUTER
MEMORY ACCESS ERRORS 

FIELD OF THE INVENTION 

The present invention relates to computer programming 
methods, and more specifically to a method for detecting 
computer memory access errors during program execution. 

BACKGROUND OF THE INVENTION 

Conventional computer program validation techniques
contemplate checking the "type" of the access requested
against the "type" of the data item being accessed. (A data
item is often referred to as an "object" or a "data object".)
An analogy might be to describe the attempted access's type 
as a key, and the object's type as a lock. The computer 
program compiler and/or run-time support could check the 
lock (as defined by the object type of the data object) to 
prevent accesses to that data object by any attempted access 
whose key (as defined by the access type of the access) does 
not meet the requirements of that lock. For example, the C 
programming language defines many different object types, 
some examples being integer (int), floating point (float), or 
pointer (*). When an expression in the program attempts to 
access the object, the compiler checks the access type of the 
access against the object type of the object. An attempt to
access an integer object with a floating point access should 
be detected as an error by the compiler, and flagged as such. 

In the following discussion, a pointer is a program vari- 
able which contains the address of another variable and also 
possibly contains attributes describing the variable pointed 
to; a pointer can be used to derive an address (called a 
pointer value) to be used to access a data object located in 
computer memory. Generally, a pointer is also located in 
computer memory (either in storage or in a register). A 
pointer provides one level of indirection, in that rather than 
addressing the data item itself, a program can address the 
pointer, which in turn provides the pointer value used to 
address the data item. 

The referent of a pointer is the variable (also called the 
data object or the object) whose memory address is con- 
tained in the pointer. The contents of a pointer can be copied 
into other pointers, and thus several different pointers to a 
single referent may exist at the same time. 

The type of a pointer specifies certain attributes of the 
referent (e.g., specifying to the compiler that the referent is
"integer" type as opposed to "floating point" type).

A memory access is, e.g., a read or a write to a referent.

The term dereference is used as a blanket term for any 
indirect memory access (i.e., a memory access through use 
of a pointer) of a referent — either through application of the 
dereference operator (e.g., '*' or '->' in C) to a pointer, or
through indexing an array or pointer variable (e.g., '[]' in C).

Programming errors are costly, both in terms of time and 
money. Memory access errors are particularly troublesome. 
A memory access error is any dereference of a pointer or 
subscripted array reference which attempts to read or write 
storage outside of the referent. This attempted access can
either be outside of the address bounds (also called the
address space) of the referent, causing a spatial access error, 
or outside of the lifetime of the referent, causing a temporal 
access error. Indexing past the end of an array is a typical 
example of a spatial access error. A typical temporal access 
error is assigning to a heap allocation after it has been freed. 

Memory access errors are possible in programming lan- 
guages with arrays, pointers, local references, or explicit 
dynamic storage management and are an important class of
errors to reliably detect. For example, in {Miller:90}, Miller
et al. injected random inputs (a.k.a. "fuzz") into a number of
UNIX utilities. On systems from six different vendors,
nearly all of the seemingly mature programs could be
coaxed into dumping core. The most prevalent errors
detected were memory access errors. In {Sullivan:91}, Sullivan
and Chillarege examined IBM MVS software error
reports over a four year period. Nearly 50% of all reported
software errors examined were due to pointer and array
access errors. Furthermore, of these errors, 25% were temporal
access errors.

Memory access errors are difficult to detect and fix
because the effects of a memory access error may not
manifest themselves except under exceptional conditions
and, when they do occur, they may be difficult to reproduce.
In addition, once the error is reproduced, it may be very
difficult to correlate the program error to the memory access
error.

Consider the following C function:

int FindToken(int *data, int count, int token) {
    int i = 0, *p = data;
    while ((i < count) && (*p != token)) {
        p++; i++;
    }
    return (*p == token); // error: this tests beyond data if no token is found
}


This function contains a latent memory access error in the
return statement expression. In operation, the function will
reference the word immediately following the array referenced
by the pointer data if the array does not contain the
token; if the word immediately following the array then does
contain the token, the wrong value will be returned by return
(*p == token);. To avoid this error, the expression return
(*p == token); should be changed to return (i < count);.
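
Applying the correction described above gives the following version, shown as a self-contained sketch. The const qualifiers are an addition for illustration and are not in the original function.

```c
#include <assert.h>

/* Returns 1 if token occurs in data[0..count-1], otherwise 0.
   The result is derived from the loop index, so the word following
   the array is never read. */
int FindToken(const int *data, int count, int token) {
    int i = 0;
    const int *p = data;
    while ((i < count) && (*p != token)) {
        p++; i++;
    }
    return (i < count);
}
```

Because the test (i < count) short-circuits before *p is evaluated, the loop never dereferences p once it has advanced past the last element.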

This function illustrates the three difficulties in finding
and fixing memory access errors. First, FindToken() will
only produce an incorrect result if the word following the
array referenced by data contains the same value as token (or
is inaccessible storage). This event is unlikely if the word
contains an arbitrary value. Second, if FindToken() creates
an incorrect result, it will be difficult to recreate during
debugging. The programmer will have to condition the
inputs of the program such that the word following the array
referenced by data once again contains the same value as
token. If the value of the illegally accessed word is independent
of the value of token, the probability of success will
be very low. Third, correlating the visible errors of the
program to the incorrect actions of FindToken() may be very
difficult. This connection may be very subtle and may not be
visible for a long period of time.

Debugging can be viewed as an attempt to correlate a
program fault to a program error. A program error is defined
as an output of a program that is incorrect with respect to the
specification of that program; this effect is what the users
see. The program fault, on the other hand, is the initial
incorrect condition (possibly many) that ultimately caused
the error condition to occur. The primary goal of any good
debugging environment is to detect errors and provide a
good correlation between errors and faults. It is preferable to
detect memory access errors immediately, thus creating
perfect correlation between the error and the fault.

Many execution environments do provide some level of 
protection against memory access errors. For example, in 
most UNIX based systems, a store to the program text will
cause the operating system to terminate execution of the
program (usually with a core dump). UNIX typically provides
storage protection on a segment granularity; the
segments are the program text, data, and stack. Other, more
hostile environments, such as MS-DOS, do not offer such
luxuries, and stores to the program text may or may not
manifest themselves as a program error. If a program error
does occur, correlating it to a fault may be difficult, if not
impossible.

As programs become larger and more complex, there is a
need for more sophisticated and comprehensive development
tools to help the programmer debug these programs.
In particular, there is a movement in the programming
community towards "safe programming" techniques, languages
and tools. Unfortunately, many of these safe programming
techniques sacrifice the expressiveness otherwise
available to the programmer using a programming language
like C or C++.

One safe programming technique is to check the "spatial
validity" of a particular pointer value used to access a
particular data object, e.g., checking that the access goes to
an address which is within the address bounds defined for
that object. Any program which incorporates such spatial
validity checks is said to exhibit "spatial safety". Ideally, the
tools used by the programmer would check each attempted
access for spatial validity, and would detect, flag, and
identify any spatial error to the programmer so the error
could be corrected.

Another safe programming technique is to check the
"temporal validity" of using a particular pointer value to
access a particular data object, e.g., checking that the data
object is indeed allocated before it is written to, initialized
before it is read, and is neither read nor written to after it has
been freed or destroyed. Any program which incorporates
such temporal validity checks is said to exhibit "temporal
safety". Again, ideally, the programmer's tools would detect
any temporal access error, and would flag the temporal error
to the programmer so it could be corrected.

There are times when a programmer would like to be
notified at the moment when a data object is accessed, or
informed of how many times a data object is accessed or of
which pointer(s) were used to access a particular data object.
One technique for providing this object-access information
to the programmer is to instrument a program by adding
watchpoints. Instrumenting a program inserts additional
code into a program in order to perform some auxiliary task.
However, providing such watchpoints to a flexible and
expressive language such as C/C++ can be difficult and
cumbersome, and can significantly slow the execution speed
of the program.

One technique for creating a safe programming environment
for C is to employ a reference-chaining technique. This
technique is similar to that used by many "smart pointer"
implementations {Edelson:91,Ginter:92}. The reference-chaining
technique creates a reference chain for each data
object in the computer system and "roots" (or otherwise
associates) each reference chain with its data object. This
technique then inserts, into the reference chain rooted at the
referent, any pointer to that referent which is generated
through the use of an explicit memory allocation (e.g., the
malloc() function in the C language), a reference operator
(e.g., the '&' operator in the C language), or an assignment
(e.g., the '=' operator in the C language). When a pointer is
later destroyed (e.g., through memory deallocation,
assignment, or return from a procedure), this technique then
removes the pointer from the reference chain.

The reference-chaining technique has a number of useful
properties. First, it is possible to ensure temporal safety by
destroying all pointer values on a referent's chain when a
referent is freed (i.e., when the memory for that referent is
deallocated), simply by stepping down the reference chain
of that referent, and assigning NULL to all pointer values.
Second, if a "destructed" pointer value is the last value in the
reference chain, it will be as the result of a storage leak,
which can then be detected immediately. (A storage leak is
any area in storage to which the program can no longer
generate a valid pointer (generating such a pointer is also
called generating a "name" for the storage). Storage-leak
errors occur when the last accessible valid pointer to a heap
object is overwritten. Without the ability to generate a name
to the heap object, the heap object cannot be freed; hence it
has "leaked" out of the heap.)

Unfortunately, the reference-chaining technique cannot
be made to work reliably in C. It is relatively easy for the
programmer to subvert the checking mechanism through
recasting and type-less calls to free(), the memory deallocation
function. Detection of storage-leak errors also fails in
the presence of circular references, where a chain of
pointers-to-pointers eventually refers back to an earlier
pointer. Additionally, the reference-chaining technique can
be unreliable because it depends on tracking pointer values.

Researchers have recently proposed providing complete
program safety through limiting the constructs allowed in
the programming language. The main thrust of this work is
to design programming languages that support garbage
collection reliably and portably (i.e., in a manner in which
an implementation can be re-used across several different
programming languages or computer architecture
platforms). For example, in "Safe:GC" {Safe:GC}, a safe
subset of C++ is defined. The safe subset does not permit any
invalid pointers to be created. For example, pointers cannot
be created via explicit pointer arithmetic. If requested, the
compiler can enforce safety within a module by ensuring
that the programmer does not use any intrinsically unsafe
operations. The safe subset requires that some amount of
checking be performed.

In addition, languages which can easily be made totally
safe have existed for a long time. For example, many
FORTRAN implementations provide complete safety
through range checking (e.g., {MDPS:F77}). However, as in
Safe:GC, these languages tend to be less expressive than
intrinsically unsafe languages such as C or C++.

A number of commercially available memory access
checking tools exist. For instance, Hastings and Joyce's
"Purify" {Purify:92} uses a safe programming technique
which is particularly easy to use because it does not require
program source: all semantic changes to the program are
applied to the object code. Purify supports both spatial- and
temporal-access error checking to heap storage, through the
use of a memory state map which is consulted at each load
and each store that the program executes. Purify also
provides uninitialized read detection and storage-leak error
detection through a "conservative collector"
{Boehm:93,Boehm:88} (described in more detail below).
Certain heap spatial access errors are detected by bracketing
both ends of any heap allocation with a "red zone". These
zones are marked in the memory state map as inaccessible.
If a load or store touches a red zone, then a memory access
error is flagged. Temporal access errors are detected by
setting the memory state of freed storage to "inaccessible".

Purify cannot detect all memory access errors. For
example, errors caused by accessing past the end of an array
into the storage region of the next variable cannot be
detected, nor can errors caused by accessing storage that has
been freed and then reallocated. These limitations occur
because Purify does not determine the intended referent of
memory accesses; it can only verify whether the accessed
storage is "active". To increase the effectiveness of temporal
access error checking, Purify "ages" the heap, i.e., holds
freed storage in the "heap free list" longer than needed. This
aging increases the storage requirements of programs that
use the heap. In addition, although Purify is portable across
programming languages (as long as each language is available
on the given computer architecture platform for which
Purify is implemented), it is not portable across platforms,
and must be re-written for each platform on which it is
desired.

Hastings' U.S. Pat. No. 5,193,180, issued Mar. 9, 1993
and assigned to Pure Software Inc., describes an implemen- 
tation of the Purify technique. An object-code expansion 
program inserts new instructions and data between preex- 
isting instructions and data of an object-code file; offsets are 
modified to reflect new positions of the preexisting instruc- 
tions and data. The added instructions monitor substantially 
all memory accesses to check for the errors of writing to 
unallocated memory, and reading from unallocated or unini- 
tialized memory. Dummy entries are added to the data 
section to aid in the detection of array-bounds violations and 
similar data errors. Furthermore, watchpoints can be estab- 
lished for more comprehensive monitoring. 

Another safe programming technique is used in Steffen's
"RTCC" {Steffen:92}. RTCC extends the functionality of
the AT&T C language compiler PCC by adding spatial-error
checking. RTCC attaches spatial object attributes to
pointers and performs spatial access error checking. It does
not, however, detect temporal access errors. In the implementation
of RTCC, the issue of interfacing to library and
system calls is addressed through "encapsulation"; Steffen
also describes augmenting "sdb" (the UNIX system
debugger) to provide users with transparent debugging support.

Another safe programming technique is used by "CodeCenter"
{Kaufer:88}. CodeCenter is an interpreted C language
environment. The checking provided is very rich; it
detects many memory access errors, and also provides 
dynamic type-checking (i.e., the type of the last store to 
memory must match the type of subsequent loads from 
memory), uninitialized read detection, errant free detection, 
and other useful checks. (The following heap-deallocation 
actions are called errant frees: freeing storage which has 
been previously freed; freeing non-heap (global or stack) 
storage; freeing an invalid address (one that does not refer to 
valid storage); freeing heap storage using an interior pointer 
(a pointer that points inside the allocation, rather than to the 
start of the allocation).) Object attributes (namely, type and 
size) are attached to each data object in storage when it is 
initialized and, when a reference is made to storage, the base 
and size attributes that are associated with the referent 
storage are also attached to the pointer value. Using this 
information, CodeCenter provides complete coverage for 
spatial access errors. The method used for temporal access 
checking cannot, however, detect all attempts to access freed 
storage after it has been reallocated for another purpose nor 
can it detect errors when pointer references are made to local 
variables. In addition, CodeCenter has large resource 
requirements; since CodeCenter programs run in an 
interpreter, the slow execution speed may discourage its use, 
and in the case of long-running programs, may preclude its 
use entirely. 
Another safe programming technique is used by "Integral
C" {Ross:87}. Integral C is an integrated programming
environment for the C language. The user interface is very
similar to CodeCenter. Internally, however, it does not
employ an interpreter. Instead, as the programmer/user
updates the C code, the C code is incrementally compiled (at
function granularity) into machine code. Like RTCC, Integral
C attaches only base and bound attributes to pointer
values, and thus it can only detect spatial access errors.

Yet another safe programming technique is used by Fischer
and LeBlanc's "UW-Pascal" compiler {Fischer:80}.
UW-Pascal supports both temporal and spatial access error
checking, but while UW-Pascal detects all spatial access
errors, certain temporal access errors may not be detected if
storage is reallocated. Because UW-Pascal lacks mutable
pointers (pointers which may be used as terms in arithmetic
expressions, thus allowing their value to be arbitrarily
manipulated by the program) and dynamically-sized arrays,
however, its access checking is much easier to implement
than the error checking of other techniques which handle
these more expressive and flexible programming-language
features.

This paragraph briefly summarizes properties of the above
described commercially-available systems. The technique
used in Purify operates on object-code files, performs an
object-code translation, provides spatial checks limited to
heap spatial access errors, provides temporal checks limited
to heap temporal access errors, and has extensions that can
detect errant frees, uninitialized reads, and storage-leak
errors. The technique used in RTCC operates on C files, uses
a safe compiler, provides spatial checks for all spatial access
errors, but provides no temporal checks. The technique used
in CodeCenter operates on C or C++ files, uses an
interpreter, provides spatial checks for all spatial access
errors, provides temporal checks for some temporal access
errors, and has extensions that can detect errant frees,
uninitialized reads, type errors, arithmetic errors, etc. The
technique used in Integral C operates on C files, uses a safe
compiler, provides spatial checks for all spatial access
errors, but provides no temporal checks. The technique used
in UW-Pascal operates on Pascal files, uses a safe compiler,
provides spatial checks for all spatial access errors, provides
temporal checks for some temporal access errors, and has
extensions that can detect errant frees, union type errors,
arithmetic faults, etc.

A closely related area of work, which can benefit from the
safe programming technique described in the invention, is
storage-leak error detection. For languages like C and C++,
storage-leak error detection is commonly implemented with
a "conservative collector" {Boehm:93,Boehm:88}. A conservative
collector sweeps memory looking for unreferenced
storage. Because it is difficult to know where all pointers are
located, the collector makes the conservative assumption
that all program-accessible (non-heap) storage contains
pointers. It then uses a traditional mark-and-sweep collection
method.

While effective, this method has some drawbacks. First,
storage-leak error detection is not immediate; it is usually
applied only when the programmer demands it or when the
program completes execution. Thus, for it to be useful, some
dynamic information (for instance, a partial call-chain) must
be kept with allocations in order for the programmer to
deduce the circumstances under which the storage-leak error
occurred. Second, the conservative assumption (that all
program-accessible (non-heap) storage contains pointers)
can cause "false hits". (False hits occur when "random"
non-pointer values, which seem to reference heap storage,
are mistaken for pointer values.) False hits can hide an actual
storage-leak error. For instance, it may appear to the conservative
collector as though some random number is a
pointer to an area of storage on the heap; in actuality the
storage pointed to by the random number leaked from the
heap when the last valid pointer to it was destroyed in error.
This problem is aggravated by large storage allocations. In
such allocations it is more likely that non-pointer values may
randomly and inadvertently reference the allocated storage;
unfortunately, it is these large storage-leak errors that the
programmer would most like to find. Third, if the program
hides pointers (for example, by encoding type information in
the upper bits of the address in a pointer) or does not keep
all pointers within the address bounds of memory
allocations, then the conservative collector may not recognize
a valid pointer, and thus may erroneously regard a piece
of heap storage as having leaked from the heap, when
actually it is still in use.

(A call-chain is the state of the stack at some point in a
program's execution; it is composed of a sequence of
function names; functions higher in the call-chain call
(possibly indirectly) the functions lower in the call-chain;
neighbors in the call-chain share a direct caller-callee relationship.
A partial call-chain is a subset of the current
complete call-chain, usually taken from the bottom of the
complete call-chain; partial call-chains are usually employed
to reduce storage requirements.)

Zorn and Hilfinger's "mprof" takes a notably different
approach to detecting storage-leak errors {Zorn:88}. During
the analyzed program's execution, mprof maintains a table
of partial call-chains, with each table entry containing a
count of how many malloc()'s and free()'s have occurred to
storage whose call-chains terminated with that sequence.
Detecting storage-leak errors then involves adjusting the
appropriate counts at calls to malloc() and free(). At a
malloc(), the current call-chain is used to increment the
appropriate malloc() counter. At a free(), a hidden pointer in
the header of the freed allocation is used to increment the
corresponding free() counter. At program termination, detection
of storage-leak errors involves reporting the partial
call-chains whose malloc() and free() counts differ.

Unlike conservative collection, the mprof technique does
not suffer from false hits; that is, a true storage-leak error
will always be detected. In addition, mprof provides a
wealth of other information useful for optimizing a program's
memory usage. The primary disadvantage of the
mprof technique (compared to conservative collection) is
that storage-leak diagnostics may only be gathered after
execution completes, and many programs do not deallocate
storage until program termination (e.g., in C, the call to
exit() will ensure that all the program's resources are
deallocated/reclaimed). This behavior can yield many
(arguably) false indications of storage-leak errors.

None of the above methods provide the detection of
temporal and spatial errors needed in the sophisticated
programming environments of today without either impacting
the flexibility of the programming language or overlaying
an oppressive amount of overhead. What is needed is a
method of detecting memory access errors which can operate
over a variety of programming languages while having
minimal impact on program execution.

SUMMARY OF THE INVENTION

The present invention provides a method for detecting
memory access errors which occur while executing a computer
program. Spatial and temporal attributes are provided
for a data object and these attributes are associated with each
pointer to that data object. On a dereference to a pointer, a
memory access check is performed which determines (a)
whether the dereference falls outside the address range
within which valid accesses may be made to the data object,
and (b) whether the dereference falls outside the time period
within which valid accesses may be made to the data object.
If the dereference falls outside the valid address range, a
spatial error is flagged. If the dereference falls outside the
valid time period, a temporal error is flagged.

According to another aspect of the present invention, a
method is described for converting a preexisting source-language
program file into a safe program. In particular, a
method is described for optimizing the inclusion of a
memory-access check. According to the method, memory
deallocations are monitored and memory-access error
checking code is skipped during periods when no new
memory deallocations have occurred.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 is a partially schematic representation of the
computer program correctness-enhancement method according
to the invention.

FIG. 2 is a detailed block diagram of certain features of
a preexisting source-level program 190 of FIG. 1.

FIG. 3 is a detailed block diagram of certain features of
a safe program 190' of FIG. 1.

FIG. 4 is a flowchart depicting the overall operation of
conversion means 195 of FIG. 1.

FIG. 5 is a detailed block diagram of the process pointers
step 420 of FIG. 4.

FIG. 6 is a detailed block diagram of the process operators
step 430 of FIG. 4.

FIG. 7 is a detailed block diagram depicting the capability
store 350 of FIG. 3.

FIG. 8a shows a flowchart of an embodiment of step 440
of FIG. 4.

FIG. 8b shows a flowchart depicting the method used by
checking step 450 of FIG. 4.

FIG. 9 shows an illustrative embodiment of the capability
store and its associated maintenance functions.

FIG. 10 shows an example of reference operator operand
parsing.

FIG. 11 shows a C-language embodiment of a safe version
of malloc() and free().

FIG. 12 shows a C-language-like embodiment of a safe
function call and return.

FIG. 13 shows an example of a spatial access error.

FIG. 14 shows an example of a temporal access error.

FIG. 15 shows a C-language embodiment of access
checking code that optimizes run-time execution.

FIG. 16 details an alternative embodiment of safe pointer
310 which facilitates optimization of the run-time execution.

FIG. 17 shows programs which were measured while using
the invention.

DESCRIPTION OF THE PREFERRED
EMBODIMENT

In the following detailed description of the preferred
embodiments, reference is made to the accompanying drawings
which form a part hereof, and in which are shown by
way of illustration specific embodiments in which the invention
may be practiced. It is to be understood that other
embodiments may be utilized and structural changes may be
made without departing from the scope of the present
invention.
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It is preferable for a programming language execution environment to
support memory access protection at the variable level, that is, an
access to a variable should only be valid if the access is within the
range (for both time and space) specified for the intended variable.
All other accesses should immediately flag an error. Any program that
supports these execution semantics is called a safe program.

FIG. 1 illustrates a computer program correctness-checking method 100
used to generate a safe program from unsafe program code. In the method
of FIG. 1, a programmer writes an initial program 190 using any modern
high-level language, e.g. C, C++, Ada, PL1, or Pascal. Such a program
would typically be compiled, but could also be interpreted or
translated. In one embodiment, as is depicted in FIG. 1, a source-level
program file 190 is transformed, at compile-time, to use an extended
pointer representation termed a safe pointer. A safe pointer contains
the value of the original pointer as well as one or more object
attributes which describe the location, size and lifetime of the
pointer referent. When a safe pointer value is created, either through
the use of the reference operator (e.g. '&' in C) or through explicit
storage allocation, the appropriate object attributes are attached. As
the value is manipulated, through the use of pointer operators, the
object attributes are transferred to any new safe pointer values.
Detecting a memory access error then simply involves validating
dereferences against the object attributes: if the access is within the
address-space and time bounds of the object, it is permitted, otherwise
an error is flagged and the access error is detected immediately.

In the embodiment shown in FIG. 1, conversion means 195 takes the
preexisting source-language program 190 ("INITFILE.C") as input and
generates a safe program 190' ("SAFEFILE.C") as one step of the overall
compilation process.

In one such embodiment, conversion means 195 is a general-purpose
computer having a memory and operating under the control of a
stand-alone computer program which is executed before the compilation
step whose output is another source-level program file 190'. Since
conversion means 195 takes source-level programming language files as
input, checking schemes can be comprehensive; the conversion means can
determine the valid address bounds and temporal validity times for the
referents in preexisting source-language program 190. In one particular
embodiment, conversion means 195 might include a precompiler such as
the AT&T USL C++ cfront Compiler Version 3.0.1 from AT&T Corp. and a
compiler such as the MIPS CC compiler Version 2.1 available from MIPS
Technology Inc., running on a DECstation 3100 computer from Digital
Equipment Corp. of Maynard, Mass.

FIG. 2 is a block diagram 200 depicting pointers 210, data objects 221
and operators 230 of preexisting program 190. Data object 221 depicts a
typical data object of preexisting program 190, with the collection of
such data objects depicted by data objects 221a, 221b, 221c, ... 221n.

Pointer 210 is typical of preexisting pointers specified in the
preexisting source-level language file 190. Each pointer 210 contains a
type field 211 which specifies the type of data object 221 to which
pointer 210 refers and a *value field 212 which specifies the memory
address of the data object 221 to which pointer 210 refers.

The preexisting program 190 typically will contain many pointers 210.
Each pointer 210 would point to a single data object 221; however,
there may be several different pointers 210 which point to the same
data object 221. The preexisting program 190 will generally attempt
many accesses to various data objects 221 using various pointers 210.

Operator 230 depicts an operator in preexisting program 190.
Preexisting program 190 typically will contain many operators 230.

To enforce access protection, the notion of a pointer value must be
extended to include information about the referent. The idea is similar
to tagged pointers used in many Lisp implementations. FIG. 3 depicts
safe program data structures, including a safe pointer 310, a data
object collection 320, safe operators 330 and a capability store 350
which can be used within safe program 190'. FIG. 3 also illustrates
checking code 340 which is added to program code 190 to perform the
memory access error checking.

Data object 321 is a block diagram of a typical data object generated
by conversion means 195. Typically, conversion means 195 copies data
object 221 into data object 321 without alteration; however, if data
object 221 contains one or more pointers 210, each pointer 210 in data
object 221 will be replaced with a corresponding safe pointer 310 in
data object 321. Data object collection 320 depicts a collection of
data objects 321 so generated by conversion means 195. Each data object
321 would be the same type and size as the corresponding data object
221, as depicted by data objects 321a, 321b, 321c, ... 321n.

Safe operator 330 depicts a safe operator in safe program 190'
generated by conversion means 195. The safe program 190' typically will
contain one safe operator 330 corresponding to each operator 230 in
preexisting program 190.

Capability store 350 stores temporal validity information used to
perform temporal validity checks on dereferences to safe pointers 310.
Checking code 340 is program code inserted into the safe program 190'
to perform memory access checks when safe operator 330 attempts to
dereference safe pointer 310.

Safe program 190' contains one safe pointer 310 corresponding to each
pointer 210 in preexisting program 190. Each safe pointer 310 points to
a single data object 321; however, several different safe pointers 310
may point to the same data object 321 in the course of execution. Safe
program 190' will generally attempt many accesses to various data
objects 321 using various safe pointers 310.

Safe pointer 310 contains a type field 311 which specifies the type of
the data object 321 to which safe pointer 310 refers. Typically, type
field 311 would contain a copy of the contents of type field 211. Safe
pointer 310 also contains a *value field 312 which specifies the memory
address of the data object 321 to which safe pointer 310 refers.
Typically, *value field 312 would contain a copy of the contents of
*value field 212.

In addition, safe pointer 310 includes a spatial attribute field 313
and a temporal attribute field 315. Spatial attribute field 313
includes a base type field 313.1 which specifies the data type of the
base address, a *base address field 313.2 which specifies the address
of the lower bound of the data object 321 to which safe pointer 310
refers and a size field 313.3 which specifies the size of the data
object 321 to which safe pointer 310 refers. (NOTE: In languages where
pointers are immutable (i.e., may only be dereferenced, or assigned
to), base type 313.1 and base address field 313.2 are redundant and may
be omitted. Even without this information, all spatial access errors
can be detected with a range check.) In one embodiment, the type
associated with the pointer value in base type field 313.1 is the same
as that specified in type field 311.
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Temporal attribute field 315 specifies the storage class of the data
object 321 to which safe pointer 310 refers. It includes a storage
class field 315.1 which specifies the storage class of data object 321
(e.g., Heap, Local, or Global), and a temporal capability field 315.2
which specifies a temporal capability number representative of a
capability value associated with the data object 321. When a dynamic
variable (data object 321) is created, either through explicit storage
allocation (e.g., calls to malloc()) or through procedure invocations
(e.g., a procedure call creating a local variable in the stack frame of
the procedure), a unique capability is issued to that data object 321,
and that unique capability is placed in capability field 315.2 of safe
pointer 310 that points to that data object 321. The unique capability
is also inserted into an associative store (capability store 350) and
then later deleted from that store when the dynamic storage allocation
is freed or when the procedure invocation returns (the exact mechanics
of this process are discussed in a following section). Thus, the
collection of capabilities in capability store 350 represents each
active valid data object 321 that has not been deallocated. Temporal
access errors occur whenever a reference is made through a stale
pointer (i.e., a pointer which references a data object whose
capability is no longer in the capability store). According to the
invention, when the program deallocates a data object 321, the unique
capability number for that data object 321 is removed from capability
store 350; the program need not locate and destroy each stale safe
pointer 310 to that data object 321.

In one embodiment, two capabilities are predefined. FOREVER is unique
and always exists in the capability store; this capability is assigned
to all global objects. NEVER is unique and never exists in the
capability store; this capability can be assigned to invalid pointers
to ensure any dereference causes an error to be detected.

By monitoring the storage class of a variable, it is possible to
detect errant storage deallocations (e.g., it is illegal to free a
global or local variable; only heap variables may be freed). Local
objects must be distinguished from heap objects in order to detect
errant frees (i.e., illegal frees of local objects). Information
distinguishing local variables from heap variables is not completely
encoded into the capability field. This distinction is not possible by
simply examining the capability field. This is why storage class field
315.1 was added into the temporal attribute 315.

In an alternative embodiment the various storage classes are provided
distinct locations in the address space, and the storage class value is
derived from the address in the base address field 313.2.

One embodiment of a C-language-like type definition for safe pointer
310 follows:

    typedef struct {
        <type> *value;
        <type> *base;
        unsigned size;
        enum {Heap=0, Local, Global} storageClass;
        int capability;    /* plus FOREVER and NEVER */
    } SafePtr<type>;

where base and size are the spatial attributes, storageClass and
capability are the temporal attributes and the type definition is
parameterized by <type>, the type of the pointer referent. This <type>
could be, e.g., int, float, or any type defined by the language or by
the program. In one embodiment of the above C-language-like safe
pointer definition, the *value attribute is the only safe pointer 310
member that can be manipulated by the program source; all other members
are inaccessible.

A safe pointer 310 can exist in three states: unsafe, invalid, and
valid. If the object attributes are incorrect, the pointer has become
unsafe and dereferencing this pointer may cause an undetected memory
access error. Therefore, a safe pointer 310 (whether invalid or valid)
must never become unsafe. If a safe pointer 310 is not unsafe, it is
either invalid or valid, depending on whether the checking invoked by a
dereference would flag an error (an invalid safe pointer 310 will flag
an error if it is invoked by a dereference). Programming languages with
mutable pointers (e.g., C) can produce invalid pointers; for example, a
loop iterating a pointer across the elements of an array exits the loop
with the pointer pointing to the memory location following the last
element. If the invalid pointer is never dereferenced, the program
would not be in error. This is precisely why the preferred embodiment
of the invention only places checking code 340 at dereferences; it is
not illegal to have an invalid safe pointer 310, only to use it.

The initial value of a safe pointer 310, if not specified by an
initialization expression, must be invalid. This condition ensures that
a dereference which occurs before the initial assignment is detected. A
simple way to invalidate a safe pointer 310 is to assign the unique
capability NEVER to its capability field 315.2.

FIG. 4 is a general flowchart depicting one embodiment of the overall
operation of the conversion means 195 of FIG. 1. Creating a safe
program from its unsafe counterpart involves three transformations:
pointer conversion, operator conversion and check insertion. Pointer
conversion extends all pointer definitions and declarations to include
space for object attributes. Operator conversion generates and
maintains object attributes. Check insertion instruments the program to
detect all memory access errors. Accordingly, at 410, a table is
prepared containing the parsed preexisting source code program 190. At
420, pointers 210 of preexisting program 190 are replaced with safe
pointers 310. At 430, operators 230 of preexisting program 190 are
replaced with safe operators 330. At 435, checking code 340 is added
which checks dereferences of safe pointers 310 by safe operators 330.
In addition, a capability store 350 is defined for use by checking code
340. At 440, the resulting safe program code is written to safe program
code file 190'. The safe program code 340 stored in program code file
190' can then be executed at step 450 and spatial and temporal access
errors occurring at dereferences of said safe pointers 310 by said safe
operators 330 can then be detected.

Pointer Definition

Each pointer definition and declaration from preexisting program 190
must be extended to include spatial attribute 313 and temporal
attribute 315 (hereinafter collectively called the object attributes).
To make this transformation transparent, the composite safe pointer 310
must mimic the first class value semantics of scalar pointers. (First
class values are intrinsic types in a computer language; the operators
of the language (e.g., + or *) may be applied to first class values,
and their application will create new values of the same type; e.g.,
numeric types are typically first class values, while composite
structures such as C structures are usually not (operations on them
must be defined by the programmer).) That is, when passed to a
function, the safe pointer 310 must be passed by value (a passing
convention for function arguments, where a copy of the argument is
passed by the caller to the callee; the callee may then manipulate the
passed-by-value argument without affecting
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the caller's argument), and when a safe operator 330 is applied to a
safe pointer 310, the result, if a pointer, must be a new safe pointer
310.

FIG. 5 is a detailed flowchart depicting one embodiment of step 420 of
FIG. 4. In this embodiment, at 521 the first pointer 210 in the
preexisting program 190 is located. At 522, a structure for safe
pointer 310 is created which corresponds to pointer 210 of preexisting
program 190. At 523, safe pointer type field 311 is loaded with the
contents of the type field 211 of pointer 210 of preexisting program
190, and safe pointer *value field 312 is loaded with the contents of
the *value field 212 of pointer 210 of preexisting program 190. At 524,
pointer 210 is replaced by safe pointer 310.

At 525, the loop is repeated if any unprocessed pointers 210 remain in
the preexisting program 190, with step 526 locating the next pointer to
be processed.

There is no need to add object attributes to array variables. Array
variables (in the C sense) are merely address constants, and thus only
exist as statically allocated objects or within structure definitions;
as a result, the spatial attributes can be generated from the address
constant and its type size, and the temporal attributes can be taken
from the safe pointer 310 to the containing object or derived from the
array name.

Operator Conversion

Safe operator 330 must interact properly with the composite structure
of safe pointer 310. When applied, it must reach into the safe pointer
310 to access the pointer value. If a safe operator 330 creates a new
pointer value, that pointer value must include an unmodified copy of
the pointer operand's object attributes. (In C there exists one
operator with two pointer operands, namely '-', which produces the
difference between two pointers. The semantics of this operator imply
that the object attributes of both operands should refer to the same
data object 321, and so the object attributes from either operand can
be copied to the destination safe pointer.) For example, in the C
statement q=p+6, the application of the '+' operator on the pointer p
creates a new safe pointer 310 which is assigned to q. The new pointer
value in q contains a copy of the object attributes from p. A safe
operator 330 which manipulates pointer values never modifies the copied
object attributes because changing the value of the pointer does not
change the attributes of the data object 321 it references. This
property holds even for pointers to aggregate structures, in which
case the object attributes refer to the entire aggregate.

FIG. 6 is a detailed flowchart depicting one embodiment of step 430 of
FIG. 4. In this embodiment, at 631 the first preexisting operator 230
in the preexisting program 190 is located. At 632 conversion means 195
determines whether the located preexisting operator 230 is an operator
which operates on pointers. If preexisting operator 230 is not such a
pointer operator, control passes to step 636 to check for other
operators. At 633.1 conversion means 195 determines whether the located
preexisting operator 230 is the assignment operator. If preexisting
operator 230 is indeed an assignment operator, control passes to step
633.2. At 633.2 conversion means 195 determines whether the located
preexisting assignment operator 230 operates by assigning a constant.
If preexisting operator 230 indeed operates by assigning a constant,
step 633.3 is performed to create a safe pointer structure 310 and
control passes to step 633.4. If preexisting operator 230 does not
operate by assigning a constant, control passes to step 633.4. At 633.4
conversion means 195 performs the assignment specified by the located
preexisting assignment operator 230, and control passes to step 636.

If at 633.1 conversion means 195 determines preexisting operator 230
is not an assignment operator, control passes to step 634.1. At 634.1
conversion means 195 determines whether the located preexisting
operator 230 is a reference operator (a reference operator is a type of
operator which creates a pointer to its operand). If preexisting
operator 230 is indeed a reference operator, control passes to step
634.2. At 634.2 a safe pointer structure 310 is created and the access
path prefix and suffix used to define the access path are computed.
Control then passes to step 634.3. At 634.3 conversion means 195 adds
code which is executed at run time to load safe pointer structure 310
with data generated from the access path prefix and suffix computed in
step 634.2. This process is described in detail below in the
explanation of FIG. 10.

If at 634.1 conversion means 195 determines preexisting operator 230
is not a reference operator, step 635 is performed to replace
preexisting operator 230 with a corresponding safe operator 330.
Control then passes to step 636.

At 636 the loop is repeated if any unprocessed operators 230 remain in
the preexisting program 190, with step 637 locating the next operator
to be processed.

As summarized in steps 633.1-633.4, the assignment operator requires
special handling if the right-hand side is a constant. Two common
pointer constants are the NULL value and string constants (for C). If
the assignment value is NULL, the NULL value can be replaced by an
invalid safe pointer value, e.g., one with the capability NEVER. For
string constants, the needed object attributes are generated at
compile-time. If the right-hand side of the assignment is a pointer
expression, the resulting pointer value (and its object attributes) is
copied to the pointer named on the left-hand side of the assignment.

Casting between pointer types does not require any special program
transformations. Casting only alerts the compiler that future pointer
arithmetic or dereferences of a particular pointer value should be made
with respect to the new type size. Casting to a non-pointer type
requires that the object attributes be dropped (if only pointers carry
object attributes) and then the cast is carried out as defined by the
language. Casting from a non-pointer type to a pointer type is
problematic if non-pointer types do not carry object attributes. This
problem is addressed below.

As summarized in steps 634.1-634.3, handling of the reference operator
(e.g., the '&' operator in the C statement q=&p->b[10]) is slightly
more complex as it must generate object attributes. In such a
situation, the reference operator is applied to an expression
(p->b[10], in this example) which names some storage. This expression
is called the access path. The result of the operation is a new pointer
value to the referent named by the expression.

To facilitate conversion, access paths are decomposed into two parts,
a prefix and suffix. The access path prefix is always non-empty and
describes the sequence of variable names, dereferences, subscripts,
field selectors, and pointer expressions leading to the memory object
being referenced. It is from this prefix that the temporal attributes
are generated. The remaining part of the access path, the access path
suffix, is composed of a sequence of field selectors and subscripts (on
array variables only). The suffix describes what extent of the object
is being referenced. (An extent means the address bounds of a pointer
referent to which a dereference is valid; for pointers to composite
structures such as arrays the valid extent of a pointer may include
many objects.)

Access paths may be further characterized as direct or indirect. A
direct access path refers to an object in the global
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or local space by name. An indirect access path contains at 
least one pointer traversal, that is, at least one dereference of 
a pointer in a pointer expression. 

Given a reference operator expression, the prefix can be
parsed by traversing the expression tree starting with the
left-most, lowest-precedence operator. The part of the
expression up to but not including the last pointer traversal
is the access path prefix; the remaining part of the expression
is the access path suffix. If the access path does not contain
any pointer traversals, the access path prefix is the name of
the referenced variable. Table 1 below shows a number of
expressions and their decomposed access paths, where c is
an array variable and y is a pointer.

TABLE 1

Expression              Prefix          Prefix Type   Suffix

a                       a               direct
a.b                     a               direct        b
a.b.c[4].d              a               direct        b.c[4].d
(**p)[3]                **p             indirect
(*p)->b                 *p              indirect      b
w->x                    w               indirect      x
w->x->y                 w->x            indirect      y
w->x->y[3].z->c[4].b    w->x->y[3].z    indirect      c[4].b

Temporal attributes are derived from the access path
prefix. If the prefix is direct, the referenced object is either
a global or a local variable. If global, the capability FOREVER
is assigned to the capability field 315.2 of the new
safe pointer 310. If local, the capability allocated to the local
variable's stack frame is copied to the capability field 315.2
of the new safe pointer 310 (frame capability allocation is
discussed in the following section). If the access path prefix
is indirect, the temporal attributes are copied from the safe
pointer 310 named by the access path prefix.
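
As a minimal sketch of the three temporal-attribute cases just described (not code from the described embodiments), the following C fragment selects the storage class and capability for a newly created safe pointer. The names PrefixKind, TemporalAttr and temporal_for_prefix() are hypothetical, and the numeric value chosen for FOREVER is an assumption; only the value 0 for NEVER is taken from this description.

```c
#include <stddef.h>

enum StorageClass { Heap = 0, Local, Global };

/* Hypothetical classification of an access path prefix. */
enum PrefixKind { DirectGlobal, DirectLocal, Indirect };

#define NEVER   0u  /* never present in the capability store */
#define FOREVER 1u  /* assumed value; always present in the store */

/* Hypothetical container for the temporal attribute 315. */
typedef struct {
    enum StorageClass storageClass;
    unsigned capability;
} TemporalAttr;

/*
 * Choose the temporal attributes for the pointer produced by a
 * reference operator:
 *   direct, global -> capability FOREVER
 *   direct, local  -> capability of the enclosing stack frame
 *   indirect       -> copy from the safe pointer named by the prefix
 */
TemporalAttr temporal_for_prefix(enum PrefixKind kind,
                                 unsigned frameCapability,
                                 const TemporalAttr *prefixPointer)
{
    TemporalAttr t;

    switch (kind) {
    case DirectGlobal:
        t.storageClass = Global;
        t.capability = FOREVER;
        break;
    case DirectLocal:
        t.storageClass = Local;
        t.capability = frameCapability;
        break;
    default: /* Indirect */
        t = *prefixPointer;
        break;
    }
    return t;
}
```

In this sketch the prefixPointer argument is only read in the indirect case, so a caller handling a direct path may pass NULL.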

To generate the spatial attribute 313 for the reference,
conversion means 195 starts with the spatial attribute of the
access path prefix. The access path prefix spatial attribute is
either the address and size of the named variable if the
access path prefix is direct, or the spatial attribute copied
from safe pointer 310 if it is indirect. Using this spatial
attribute, the actual base of the reference is computed from
the access path suffix, which describes the sub-object being
referenced. Since all members of the referenced object (i.e.,
the members of any contained structure) are of a known size,
the offset into the object and its size can be computed at
compile time. In the event that the final term of the suffix is
a subscript, the spatial attributes are set to the extent of the
entire array. This technique allows the safe pointer 310 to be
subsequently manipulated to point to other members of the
array.

FIG. 10 shows an example of an access path, its decomposition
into the prefix and suffix, and the C statements
required to construct the correct safe pointer value in p. In
this example, & is the reference operator, f->g->h[3].i 103
is the access path prefix, j.k[4] 105 is the access path suffix,
and -> 104 is the last pointer dereference; h is a pointer, while k
is an array variable. Calculation of base pointer and size
demonstrates the widening required if the final term is an
array subscript, in that the entire access path to the array is
used, rather than just to the fourth element of that array.
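
The following C sketch illustrates, under stated assumptions, how a safe pointer might be built for a reference whose access path is indirect and whose suffix ends in an array subscript. It does not reproduce FIG. 10: the structure layouts (struct J, struct Inner), the non-parameterized types SafePtrInner and SafePtrInt, and the function ref_k4() are hypothetical stand-ins, and the dereference check on the prefix pointer is omitted for brevity.

```c
#include <stddef.h>

enum StorageClass { Heap = 0, Local, Global };

/* Hypothetical referent layout: the suffix of &i->j.k[4] is j.k[4]. */
struct J     { int count; int k[8]; };
struct Inner { int tag; struct J j; };

/* Hand-expanded instances of the parameterized safe pointer type. */
typedef struct {
    struct Inner *value;
    struct Inner *base;
    unsigned size;
    enum StorageClass storageClass;
    unsigned capability;
} SafePtrInner;

typedef struct {
    int *value;
    int *base;
    unsigned size;
    enum StorageClass storageClass;
    unsigned capability;
} SafePtrInt;

/* Build the safe pointer for &i->j.k[4], where i is the safe pointer
 * named by the access path prefix. */
SafePtrInt ref_k4(SafePtrInner i)
{
    SafePtrInt p;
    char *object = (char *)i.value;

    /* The pointer value addresses the requested element. */
    p.value = &i.value->j.k[4];

    /* Widening: the final suffix term is a subscript, so base and size
     * describe the whole array k; the offsets are compile-time constants. */
    p.base = (int *)(object + offsetof(struct Inner, j)
                            + offsetof(struct J, k));
    p.size = (unsigned)sizeof(i.value->j.k);

    /* Indirect prefix: temporal attributes are copied from i. */
    p.storageClass = i.storageClass;
    p.capability = i.capability;
    return p;
}
```

Because base and size cover all of k, later arithmetic on the returned pointer can reach any element of k without leaving the recorded bounds.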
Check Insertion 

If the safe pointer spatial attribute 313 and temporal
attribute 315 are both correct, complete safety for all pointer
and array accesses is provided by inserting an access check
before each pointer or array dereference. The term "dereference"
is used as a blanket term for any indirect access,
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either through application of the dereference operator (e.g., 
'*' or '->' in C) or through indexing an array or pointer
variable (e.g., '[]' in C).

The following C-language-like code illustrates one 
embodiment of a memory-access checking function: 



    void ValidateAccess(<type> *addr) {
        if ((storageClass != Global) && !ValidCapability(capability))
            FlagTemporalError();
        if (((unsigned)addr - (unsigned)base) > (size - sizeof(<type>)))
            FlagSpatialError();
        /* valid access! */
    }



This function is parameterized by <type>, the type of the
safe pointer's referent. FlagTemporalError() performs
system-specific handling of a detected temporal access error
(e.g., forcing a core dump). FlagSpatialError() performs the
same function, but for a spatial access error. The function
ValidCapability() indicates whether or not the passed capability
is currently active (i.e., is in capability store 350).

The dereference check first verifies that the referent is
alive by performing a temporal access check including an
associative search for the referent's capability. If the referent
has been freed, the capability would no longer exist in the
capability store 350 and the check would fail. Because
capabilities are never re-used, the temporal check fails (i.e.,
detects the error) even if the storage has been reallocated. If
the temporal check succeeds, then the storage is known to be
alive, and an address bounds check is applied to verify that
the entire extent of the access fits into the address space
specified for the referent.
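
The check sequence above can be sketched in self-contained C as follows. This is an illustration under assumptions, not the described implementation: the capability store is modeled as a small fixed array with linear search, the counter start value reserves 0 for NEVER and (by assumption) 1 for FOREVER, and CheckAccess() returns a result code in place of calling the error-flagging functions. The function names NextCapability(), InsertCapability(), ValidCapability() and DestroyCapability() follow FIG. 11; their bodies here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum StorageClass { Heap = 0, Local, Global };
enum AccessResult { AccessOk, TemporalError, SpatialError };

#define STORE_MAX 64

/* Assumed store layout: an unordered array of live capabilities. */
static unsigned store[STORE_MAX];
static size_t storeCount = 0;
static unsigned nextCap = 2;   /* 0 = NEVER, 1 = FOREVER (assumed) */

/* Capabilities come from an incrementing counter and are never reused. */
unsigned NextCapability(void) { return nextCap++; }

int InsertCapability(unsigned cap)
{
    if (storeCount == STORE_MAX)
        return 0;              /* store full */
    store[storeCount++] = cap;
    return 1;
}

int ValidCapability(unsigned cap)
{
    for (size_t i = 0; i < storeCount; i++)
        if (store[i] == cap)
            return 1;
    return 0;
}

void DestroyCapability(unsigned cap)
{
    for (size_t i = 0; i < storeCount; i++)
        if (store[i] == cap) {
            store[i] = store[--storeCount];  /* swap-remove */
            return;
        }
}

/* Safe pointer specialized for int referents. */
typedef struct {
    int *value;
    int *base;
    unsigned size;
    enum StorageClass storageClass;
    unsigned capability;
} SafePtrInt;

/* Temporal check first, then the one-comparison spatial check. */
enum AccessResult CheckAccess(SafePtrInt p)
{
    if (p.storageClass != Global && !ValidCapability(p.capability))
        return TemporalError;
    if ((uintptr_t)p.value - (uintptr_t)p.base > p.size - sizeof(int))
        return SpatialError;
    return AccessOk;
}
```

In this sketch, a pointer whose capability has been destroyed keeps failing the temporal check even after a new capability is issued for the same storage, since the counter never hands out the old number again.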

The C-like spatial access check above takes advantage of
the wrap-around property of unsigned arithmetic to simplify
the address-bounds check. If the accessed address is prior to
the start of the array, the unsigned subtraction underflows
and creates a very large number, causing the test to fail (i.e.,
detect the error). The advantage of this expression over
traditional address-bounds checks is that it only requires one
conditional branch to implement. This simplification
reduces the additional control complexity introduced by
dereference checks, which can result in better optimization
results and better dynamic executions. In another
embodiment, a traditional address-bounds check of the
form:

    ((addr < base) || (addr > (base + size - sizeof(<type>))))

may be used. Such an embodiment does, however, require
two conditional branches (or, in the alternative, extra
instructions to combine the boolean terms).
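
To make the equivalence of the two forms concrete, the sketch below writes both as C predicates over integer addresses. The function names in_bounds_wrap() and in_bounds_two() and the explicit access-size parameter are assumptions made for illustration; both predicates assume the access size does not exceed the object size and that base + size does not overflow.

```c
#include <stdint.h>

/* Single-comparison form: if addr < base, the unsigned subtraction
 * wraps around to a very large value and the comparison rejects it. */
int in_bounds_wrap(uintptr_t addr, uintptr_t base,
                   uintptr_t size, uintptr_t access)
{
    return (addr - base) <= (size - access);
}

/* Traditional form: two comparisons, one per bound. */
int in_bounds_two(uintptr_t addr, uintptr_t base,
                  uintptr_t size, uintptr_t access)
{
    return !((addr < base) || (addr > (base + size - access)));
}
```

For a 16-byte object at address 1000 and 4-byte accesses, both predicates accept addresses 1000 through 1012 and reject 996 and 1013.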

FIG. 8a is a detailed flowchart depicting the operation of
one embodiment of step 440 of FIG. 4. In this embodiment,
at 161 the safe program source-code file is compiled. Block
161.1 shows the C++ safe program source code being
compiled and output as C language code. Block 161.2 shows
the resulting C safe program source code being compiled
and the results placed in relocatable object files.

At 163 the relocatable object files are linked with library
and special safe run-time support to form safe program
executable code. This safe run-time support includes extra
checking of attempted accesses to detect spatial and temporal
access errors. One part of this checking involves converting
unsafe versions of malloc() into safe versions, and
unsafe versions of free() into safe versions in the manner
described above. At 164, the safe program executable code
file resulting from steps 161-163 above is written to safe
program file 190'.
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In another embodiment, the safe pointer conversion is
integrated into the compiler used by conversion means 195
and the output 190' is a relocatable object file which is then
linked with the other parts of the program. This embodiment
allows greater efficiencies because the compiler and conversion
means 195 have more information about what the
other is doing. It also provides greater safeguards against
programmer circumvention of the memory access checking
features added by conversion means 195.

FIG. 8b is a detailed flowchart depicting execution of the
safe program in step 450 of FIG. 4. At 810, spatial attributes
313 for data object 321 are created by the compiler. In one
embodiment, spatial attributes include values for base type
313.1, *base pointer 313.2, and size 313.3. At 820, spatial
attributes 313 for data object 321 are associated with safe
pointer 310. For instance, in the embodiment shown in FIG.
3, step 820 comprises loading values for base type 313.1,
*base pointer 313.2, and size 313.3 into their respective
fields in safe pointer 310. At 830, temporal attributes 315 for
data object 321 are created by the run-time support code. In
one embodiment, temporal attributes include storage class
315.1 and capability 315.2. At 840, temporal attributes 315
for data object 321 are associated with safe pointer 310. In
the embodiment shown in FIG. 3, step 840 includes loading
values for storage class 315.1 and capability 315.2 into their
respective fields in safe pointer 310.

When a dereference occurs, a check is made to determine
whether the dereference is spatially and temporally valid.
Such a determination may be made by checking code 340
called by the execution of a safe operator 330. In such an
embodiment, at 850 checking code 340 verifies that the
access attempted by the dereference of a safe pointer 310
will be within the address bounds defined by spatial
attributes 313. If not, an error notification is made at 855. In
one such embodiment, such a spatial access error leads to
program termination (e.g. a core dump). In another such
embodiment, the error is noted (e.g. posted to the user) and
execution continues.

After the spatial validity check of step 850, checking code

next unique capability. unsafe_malloc() and unsafe_free()
are interfaces to the system-defined storage allocator.

During allocation, malloc() generates a safe pointer 310
using the size and location of the allocation request. The call
to NextCapability() returns the next available and unused
capability. In this embodiment, NextCapability() is implemented
with an incrementing counter. (An alternative
embodiment uses a pseudo-random number generator (a
number generator which generates a sequence of N random
numbers without duplication within the sequence). This has
the advantage of helping to thwart a programmer who might
try to generate a counterfeit pointer.) The capability is
inserted into capability store 350 via the call to
InsertCapability(). The call to bzero() clears the entire
storage allocation. This action ensures that any pointers in
the untyped allocation are initially invalid (this embodiment
provides that the storage class of Heap and capability
NEVER are both assigned the value of 0).

Function calloc() in FIG. 11 simply calls malloc() as in
either case the storage is cleared before it is returned.

The implementation of realloc() in FIG. 11 is slightly
more subtle. This function takes an existing storage allocation
and resizes it to the requested size. The reallocated
storage may move for any request, either larger or smaller. If
moved, the contents of the new allocation will be unchanged
up to the lesser of the new and old sizes. In a safe programming
environment, the storage must be moved in all cases,
otherwise, there may exist a safe pointer 310 (which cannot
be located and changed, because nothing connects an object
back to each of the possibly many copies of the pointer that
points to it) whose spatial attribute 313 has an incorrect
record of the referent size. If dereferenced, this pointer may
flag errors even though the access was valid in the reallocated
storage, or worse, the reallocation may have shrunk
the referent, creating unsafe pointers whose referent sizes
are too large. Both these problems are solved by always
moving the storage. This action will force the program to
update any old pointers to the previous allocation. Because
the reallocated storage is allocated under a new capability,
340 verifies at 8*0 that the attempted access is to an object 40 any stale pointers to the previous allocation will flag errors 
for which a temporal validity number 721 currently exists in if dereferenced. The remaining storage in the reallocation 
capability store 350, If the check fails, an error notification need not be cleared if the reallocated storage is larger than 
is made at 865. Once again, the error can be handled in any the original storage, as the safe call to mallocO returns 
manner, ranging from a warning to a core dump. Control cleared storage. 

men moves to 870 and the program is allowed to access the 45 At calls to freeO, the capability of the allocation 

data object. (contained in the safe pointer temporal attribute 315) is 

Run-Time Support deleted from the capability store 350 by the call to 

The explicit storage allocation mechanism must be DestroyCapabilityO. This embodiment also verifies that the 

extended to create safe pointers. During allocation, a capa- freed storage is indeed a heap allocation and a pointer to the 

bility must be allocated for the storage, and any contained 50 head of the allocation (as this condition is required by 

pointers must be invalidated. At deallocation, the capability freeO). 

given to the storage must be destroyed. The same allocation mechanism is applied to the dynamic 

FIG. 11 shows how this support would be provided for storage allocated in procedure stack frames. When a func- 

mallocO, the storage allocator provided under UNIX tion is invoked, a capability must be allocated for the entire 

InsertCapabilityO, ValidCapabilityO, and 55 frame if it contains any referenced locals. Any pointers 

DestroyCapabilityO are insert, locate, and delete contained in the frame must be set to an invalid state. The 

capabilities, respectively which maintain capability store steps taken to apply the frame allocation mechanism to a 

350. bzeroO clears size bytes of memory starting at p.value. function is illustrated in FIG. 12. 

bcopy() copies min(size, p. size) bytes of storage fromp.base FIG. 12 shows how a C function is modified to include a 

to new.base. FlagDuplicateFreeO is a system-specific func- 60 capability assignment for each frame. In FIG. 12, a proce- 

tion which flags an error indicating that the program dure prologue is inserted before the function-specific code in 

attempted to free a previously heed heap allocation, order to define a frame capability. In the code fragment 

FlagNonHeapFreeO flags an error indicating that the pro- shown, InsertCapabilityO inserts a capability into the capa- 

gram attempted to free memory that is not in the heap. bility store and ZeroFramePointersO ensures that any point- 

RagNonOriginalFree() flags an error indicating that the 65 ers in the procedure stack frame are initially invalid by 

program attempted to free memory without using a pointer clearing the frame storage (ZeroFramePointersO serves the 

to the head of the allocation. NextCapabilityO returns the same purpose as the call to bzero() in mallocO; it is a 
system-specific function which clears all pointers in the newly allocated stack frame). In a system where stack-frame allocations are strongly typed, ZeroFramePointers() can be implemented as a function which simply makes NULL assignments to all the frame pointers. DestroyCapability() deletes a capability from the capability store. NextCapability() returns the next unique capability.

If the language supports non-local jumps (e.g., longjmp() in C), the run-time support must delete the frame capabilities of any elided function frames. This operation can be simply and portably implemented if the local capability space and heap capability space are kept disjoint, and function frame capabilities are allocated using an incrementing counter. The allocation of frame capabilities then becomes a depth-first numbering {Dragon:86} of the dynamic call graph. When a non-local jump occurs, all elided frame capabilities between the source frame and destination frame are deleted by removing all frame capabilities in the capability store that are larger than the frame capability of the destination frame. This mechanism only works if the source and destination frames are on the same call stack; this stipulation may not be true in all cases (e.g., coroutine jumps).

Capability Store

In one embodiment, capability store 350 is an associative memory containing the capabilities of all active memory. It is implemented as a hash table with the capability as the hash key. Accesses to capability store 350 exhibit a great deal of temporal locality, so moving accessed elements to the head of the hash table bucket chains is likely to decrease average access time.

FIG. 7 is a block diagram of a hash table implementation of capability store 350. In FIG. 7, hash index table 710 is a table of pointers to capability store table elements 720. In one embodiment, hash index table 710 contains 1024 entries. Hash index 701 is an index into hash index table 710. Hash index 701 is derived from a capability number that checking code 340 requests a search for in capability store 350. In this embodiment, hash index 701 is calculated by shifting said capability number right 16 bits, exclusive-ORing the result with the original said capability number, and then masking that result to leave only the low-order 10 bits, which is then used to select one of the 1024 entries in hash index table 710.

Chain pointers 711a-711w each point to the head element in a chain of capability store table elements 720. In one embodiment, capability store 350 contains up to 1024 chains of capability store table elements 720. The capability number for every capability store table element 720 on a particular chain will have the same value for its hash index 701. Thus by searching down this chain, one can find this capability if it exists in capability store 350. A *chain pointer 711 with a value NULL indicates that no capability store table elements 720 are chained to this chain pointer 711 (that is, that there are no entries on the chain of capability store table elements 720 for this hash index 701).

Each capability store table element 720 comprises a temporal validity number 721 and a *chain pointer 722. A *chain pointer 722 with a value NULL indicates that no more capability store table elements 720 are chained to this capability store table element 720 (that this element is the end-of-chain).

FIG. 9 is a block diagram depicting an alternative embodiment of safe pointer 310 of FIG. 3, and capability store 350 of FIG. 7. In FIG. 9, safe pointer 910 is the alternative embodiment for safe pointer 310 of FIG. 3. Type field 911 specifies the type of data object 321 to which safe pointer 910 refers. Type field 911 would contain a copy of the contents of type field 211. *Value field 912 specifies the memory address of the data object 321 to which safe pointer 910 refers. *Value field 912 would contain a copy of the contents of *value field 212. Spatial attributes 913 contains the spatial attributes for the data object 321 pointed to by safe pointer 910. Base type 913.1 specifies the type of the object 321 to which safe pointer 910 points. The type associated with the pointer value in *base field 913.1 is generally identical to that specified in type field 911, and need not be included for computer programming languages which do not require specification of a type for *base field 913.2. *Base field 913.2 specifies the address of the lower bound of the data object 321 to which safe pointer 910 refers. Size field 913.3 specifies the size of the data object 321 to which safe pointer 910 refers.

Temporal attributes 915 specify the temporal attributes used to check accesses to data object 321. Storage class field 915.1 specifies the storage class of the data object 321 to which safe pointer 910 refers. Capability field 915.2 specifies a capability value associated with the data object 321 to which safe pointer 910 refers. Backpointer 917 comprises a pointer to a single capability store element 920 which specifies the capability of the data object 321 to which safe pointer 910 points. Free element list 930 is a list of free capability-store elements 920. Chain pointer 932 points to the head element in a chain of free capability store table elements 920. In this embodiment, free element list 930 will be checked when the safe program 190' needs a capability-store element 920, and if none are found, then 1024 capability store elements 920 are created and linked to chain pointer 932.

Capability store elements 920 each comprise a temporal validity number 921 corresponding to the current temporal validity of the data object 321 associated with this capability store element 920, and a chain pointer 922 used to chain this capability store element 920 when it is on the free element list.

In this embodiment, a temporal validity check comprises comparing temporal validity number 921 (accessed using backpointer 917) to the value in safe pointer capability 915.2. If temporal validity number 921 matches safe pointer capability 915.2, then the access is allowed; otherwise an error is flagged. Any form of matching could be used and fall within the scope of this invention. For instance, an error could be flagged if temporal validity number 921 does not equal the value in safe pointer capability 915.2, if temporal validity number 921 does not equal the negative of the value in safe pointer capability 915.2, or any other matching mechanism available to one skilled in the art of computer programming.

When dynamic storage data object 321 is deallocated, its capability store element 920 is reclaimed by destroying its temporal validity number 921 (in one embodiment, by assigning predefined capability NEVER to it) and chaining it to the top of free list 930. Since safe pointers 910 may still access the storage location that once held a capability store element 920 (e.g., an errant reference such as a reference to a deallocated data object 321), any storage which holds a capability store element 920 must never be used for any other purpose.

Implications of Complete Error Coverage

The above safe programming technique can detect all memory access errors provided that the following qualifications hold:

a) storage management must be apparent to the translator;

b) the referents of all pointer constants must have a known location, size, and lifetime; and
c) the program must not manipulate the object attributes of any pointer value.

Error coverage is limited to storage management controlled by the safe programming run-time system. If a program implements a domain specific allocator at the user level, some memory access errors, as viewed by the programmer, can be missed.

Consider, for example, a fixed-size storage allocator. If a program relies heavily on a fixed size structure, storage requirements and allocation overheads can be greatly reduced by applying a fixed-size allocation strategy. At the program level, the fixed-size allocator calls the system allocator (e.g., malloc() or sbrk() in the C language) to allocate a large memory allocation. The fixed-size allocator then slices the memory allocated into fixed-size pieces with a zero overhead for each allocation. Under such a scheme, the safe programming techniques described above can only be used to ensure that accesses to a fixed-size piece fall within the overall memory allocated. There is no mechanism for inter-slice verification. This imprecision occurs because the translator cannot disambiguate the user level storage allocation actions from other pointer related program activities.

With some programmer intervention this problem can be solved. Any useful safe compiler implementation has to provide systems programmers with an interface through which systems programmers can construct and manipulate spatial attribute 313 and temporal attribute 315 of safe pointer 310. In the case of the fixed-size storage allocator, the programmer would specify the base and size of the fixed-size allocation for the spatial attribute 313. Storage class 315.1 and capability 315.2 of a new safe pointer 310 are generated from the safe pointer 310 that points to the block from which the fixed-size allocation was derived.

Without the qualification listed in b) above, the compiler may not be able to generate correct object attributes for a pointer constant. For example, device driver code typically creates pointers to device buffers and registers by recasting an integer to a pointer value. The translator has no way of knowing the size and lifetime of the referent; thus program safety cannot be maintained. In C, the only well-defined pointer constants are NULL, strings, and functions. For all other cases, this problem can be avoided by supplying the programmer with an API suitable for specifying the size and lifetime of problematic pointer constants.

(There are two ways functions can be integrated into a safe programming framework. If [...] function pointers can only be assigned [...] intermix with safe pointers and may remain simple pointers. Then, the only check required at dereferences is a non-NULL check. If cast operators may be applied to function pointers, they can be represented by a safe pointer 310 with a storage class of "Function" (e.g., storageClass 315.1 is assigned "Function"). Then, at dereferences of function pointers, the checking code must ensure the pointer value is a function pointer and has not been changed (e.g., storageClass 315.1="Function" and value 313.2=base 312); at non-function pointer dereferences, the storage class 315.1 of the safe pointer 310 must be checked to see that it is not a function pointer (e.g., storageClass 315.1 != "Function").)

Qualification b) does not, however, preclude the use of recasts from non-pointer variables to pointer variables. To successfully support these operations, object attributes must be attached to all variables. In general, to provide complete safety, object attributes must be attached to any storage that could hold a pointer value. It should be noted, however, that most "well behaved" programs will only require pointer variables to carry object attributes.

Finally, a requirement that the safe program must not manipulate the object attributes of any pointer value protects these object attributes. If a program can arbitrarily manipulate the object attributes of a pointer value, the technique can be subverted. For example, changing the storage class of a pointer from Global to Heap and then freeing the pointer would likely cause disastrous effects under the above-described storage allocation scheme, and these effects would not be detected by the described safe programming framework.

In the safe programming technique described above, object attributes attach only to pointer values. In such a scheme, the danger exists that pointer values will be manipulated through the use of recasts or unions. With a recast, it is possible to type storage in the referent first as a non-pointer value, manipulate the storage arbitrarily, and then recast the referent storage to a (now unsafe) pointer. With a union, it is possible to create a pointer value in one field and then manipulate the object attributes of the pointer value through another field of the union.

One way to prevent this kind of manipulation is to attach object attributes to each byte of storage. [...] allocation, object attributes are assigned to each byte of allocated storage. For types larger than one byte, the object attributes are copied to all other storage holding the allocation. [...] be maintained at the byte level.

In reality, for "well behaved" programs a high margin of safety can be provided by attaching object attributes only to pointer values [...] pointer values are never created from or manipulated as non-pointer values. In one embodiment of the present invention, if [...] (i.e., [...] recast), a safe [...] operating within conversion means 195 makes a conservative approximation as to the intended referent of the new pointer. If [...] 195 is within an area of live storage, access is allowed to proceed. Note that in such an embodiment, since the program may have manipulated the pointer value to [...] before or after the intended referent, or to recast it to a non-pointer value, the new pointer [...]. Conversion [...] prevent unintentional (e.g., through incorrect use of a union) manipulation of pointer values.

Optimizing Dereference Checks

In the interest of performance, it may be possible to elide (skip) dereference checks and still provide complete program safety. If a dereference of pointer value v has not been invalidated by some program action, a subsequent, equivalent check may be skipped.

This check optimization can be implemented either at run-time or at compile-time. Run-time check optimization has the advantage of being more flexible. Only the checks absolutely required to maintain program safety need be executed. However, the cost for this precision is to keep extra safe pointer state information which must be copied, maintained, and checked at each dereference. Compile-time check optimization, on the other hand, is less flexible because the safe program 190' must constrain the decision to elide a check to all previous possible executions leading to a program point. The advantage of compile-time check optimization is that no additional overhead is required at run-time to determine if a check may be elided.

Run-Time Check Optimization

A framework for dynamically eliding spatial and temporal
checks has been designed and implemented. Spatial checks
have no side effects, thus "memoization" {Field:88} (or
function caching) can be employed to elide their evaluation.

FIG. 16 depicts an alternative embodiment of safe pointer
310 of FIG. 3. Like safe pointer 310, safe pointer 170 includes a
type field 171 and a *value field 172. Type field 171 specifies
the type of data object 321 to which safe pointer 170 refers.
*Value field 172 specifies the memory address of the data
object 321 to which safe pointer 170 refers. Like safe pointer
310, safe pointer 170 includes a spatial attribute field 173
and a temporal attribute field 175. Like spatial attribute field
313, spatial attribute field 173 includes a base type field
173.1, a *base address field 173.2 which specifies the
address of the lower bound of the data object 321 to which
safe pointer 170 refers, and a size field 173.3 which specifies
the size of the data object 321 to which safe pointer 170
refers. However, safe pointer 170 also contains last address
field 173.4 containing a copy of the effective address of the
last dereference of this safe pointer 170.

Like temporal attribute field 315, temporal attribute field
175 specifies the storage class of the data object 321 to
which safe pointer 170 refers. Like temporal attribute field
315, temporal attribute field 175 includes a storage class
field 175.1 which specifies the storage class of data object
321 (e.g., Heap, Local, or Global), and a temporal capability
field 175.2 which specifies a temporal capability number
representative of a capability value associated with the data
object 321. In addition, temporal attribute field 175 contains
free count field 175.3 which contains a copy of the global
free counter 179. In this embodiment, the safe program 190'
contains one global free counter 179 which is incremented
each time a free() function is called to deallocate memory
(i.e., to deallocate a data object 321 from the heap). In one
embodiment, the safe free() function increments the global
free counter 179 each time it destroys a capability (a
capability is destroyed by invalidating a temporal validity
number 721 in a capability store table element 720). Each
time a memory access is made using safe pointer 170, its free
count field 175.3 is loaded with a copy of the current value
in global free counter 179. The next access using safe pointer
170 compares the value in global free counter 179 with the
value in free count field 175.3 of safe pointer 170; if the
values are the same, the safe program 190' can assume that
the value in capability field 175.2 is still valid.

(In an alternative embodiment, a separate global
free counter 179 exists for each type of object which can be
specified by storage class field 175.1. This embodiment
improves the efficiency of the optimization scheme by
reducing the number of events which will cause the global
free counter 179 for a particular type of object to be
incremented.)

At any dereference, the spatial check may be elided
(skipped) if the effective address of this dereference is the
same as the value stored in last address field 173.4. This test
is shown in FIG. 15 in the if statement surrounding the
address-bounds check. (In an alternative embodiment (not
shown), it may be useful to "memoize" more than one set of
last-effective-address operands. In yet another alternative
embodiment (not shown), both the effective address of the
last dereference (i.e., use of '*'), and the effective address of
the last subscript operation (i.e., use of '[]') are memoized.
Changes in the former are tracked with a single "dirty" bit.
Changes in the latter are tracked by retaining a copy of the
last index applied to the pointer value.)
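
The last-address test described above can be sketched in C as follows. This is only an illustrative sketch, not the code of FIG. 15: the struct layout, the names SafePtr, last_addr, spatial_ok and spatial_checks_run, and the convention that a NULL last_addr means "nothing memoized yet" are all assumptions made for the example. It also assumes every dereference of a given pointer has the same access width, as is the case for a typed pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; field names are illustrative only. */
typedef struct {
    char  *value;      /* current pointer value (cf. *value field 172)      */
    char  *base;       /* lower bound of the referent (cf. field 173.2)     */
    size_t size;       /* referent size in bytes (cf. field 173.3)          */
    char  *last_addr;  /* last address that passed the check (cf. 173.4);
                          NULL means no check has been memoized (assumed)   */
} SafePtr;

static unsigned long spatial_checks_run = 0;  /* instrumentation only */

/* Returns 1 if an access of `width` bytes at p->value is within bounds. */
int spatial_ok(SafePtr *p, size_t width)
{
    assert(p != NULL);

    /* Memoized case: same effective address as the last checked access. */
    if (p->last_addr != NULL && p->value == p->last_addr)
        return 1;

    spatial_checks_run++;
    /* Unsigned subtraction wraps to a huge offset when value < base,
       so a single comparison covers both the lower and upper bound. */
    uintptr_t off = (uintptr_t)p->value - (uintptr_t)p->base;
    if (width > p->size || off > p->size - width)
        return 0;

    p->last_addr = p->value;  /* remember the address that passed */
    return 1;
}
```

A repeated dereference at an unchanged address returns through the first branch and never reaches the bounds comparison; any pointer arithmetic changes value and so forces the full check again.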

The C-like function of FIG. 15 is parameterized by
<type> which is the type of safe pointer referent 321. In FIG.
15, FlagTemporalError() performs system-specific handling
of a temporal access error, e.g., to force a core dump.
FlagSpatialError() performs the corresponding function for
a spatial access error. The function ValidCapability() indicates
whether or not the passed capability is currently active,
i.e., is in the capability store 350. The variable currentFreeCount
is a global counter incremented each time storage is
deallocated.

To elide temporal checks, a copy of a global counter,
incremented when storage is deallocated, is kept in safe
pointer 310. If this counter, which is called the free counter,
has not changed since the last temporal check, the referent
has not been freed and the temporal check can be safely
skipped. In one embodiment, the free counter does not
increment when a procedure returns. Rather, the checking
code always performs temporal checks on pointers to local
variables. This strategy works very well in practice because
procedure returns are quite frequent, while the use of local
referents is generally infrequent. In an alternative embodiment,
two counters are used: a free counter for use with heap
objects, and a return counter for use with the local objects of
function call returns.
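
The free-counter test can be sketched in C as follows. This is a minimal sketch under stated assumptions, not the code of FIG. 15: the capability store is reduced to a small array (live), and the names Temporal, checked, store_lookups, valid_capability, destroy_capability and temporal_ok are hypothetical. It follows the first embodiment described above, in which pointers to local objects are always checked.

```c
#include <assert.h>
#include <stddef.h>

enum StorageClass { HEAP, LOCAL };

#define MAX_CAPS 64                       /* toy capability store size (assumed) */
static unsigned char live[MAX_CAPS];      /* live[c] != 0: capability c is active */
static unsigned long current_free_count;  /* incremented on every heap free       */
static unsigned long store_lookups;       /* instrumentation only                 */

typedef struct {
    enum StorageClass storage_class;
    unsigned capability;
    unsigned long free_count;  /* snapshot of the counter at the last check */
    int checked;               /* nonzero once a check has succeeded        */
} Temporal;

/* Stand-in for a capability store search. */
static int valid_capability(unsigned c)
{
    store_lookups++;
    return c < MAX_CAPS && live[c];
}

/* Called by the safe free(): invalidate the capability, bump the counter. */
void destroy_capability(unsigned c)
{
    if (c < MAX_CAPS)
        live[c] = 0;
    current_free_count++;
}

/* Returns 1 if the referent is still live. */
int temporal_ok(Temporal *t)
{
    assert(t != NULL);

    /* Nothing has been freed since this pointer last passed the check,
       so a heap referent cannot have been deallocated in the meantime. */
    if (t->storage_class == HEAP && t->checked &&
        t->free_count == current_free_count)
        return 1;

    /* Locals, and heap pointers after any free, take the full check. */
    if (!valid_capability(t->capability))
        return 0;

    t->free_count = current_free_count;
    t->checked = 1;
    return 1;
}
```

Note the coarse granularity: freeing any heap object forces the next access through every heap pointer to repeat the store lookup, which is the cost the per-storage-class counters of the alternative embodiment aim to reduce.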
Compile-Time Check Optimization

A compile-time optimization framework embodiment will
be described next. This embodiment's algorithm implements
a forward data-flow framework similar to that used by
common subexpression elimination {Dragon:86}. In this
embodiment, because of the simplified address-bounds
check, there is no need to split the optimization into upper-bounds-check
and lower-bounds-check elimination. A flow
graph of this compile-time check optimization algorithm
embodiment follows:



Input:  A flow graph G with blocks B, with gen[B_i] and kill[B_i]
        computed for each block B_i ∈ B. gen[B_i] is the set of
        check expressions generated in B_i. kill[B_i] is the set of
        check expressions killed in B_i. The entry block is B_1.
Output: A flow graph G with redundant checks deleted.
Method: The following procedure is executed twice, once for
        spatial check optimization and again for temporal
        check optimization.

    /* initialize out sets */
    in[B_1] = ∅;
    out[B_1] = gen[B_1];
    U = ∪ gen[B_i], over all B_i ∈ B;
    for B_i ∈ B − B_1 do
        out[B_i] = U − kill[B_i];
    /* compute availability of checks, in sets */
    change = true;
    while change do begin
        change = false;
        for B_i ∈ B − B_1 do begin
            in[B_i] = ∩ out[P], over all P ∈ Pred[B_i];
            oldout = out[B_i];
            out[B_i] = gen[B_i] ∪ (in[B_i] − kill[B_i]);
            if out[B_i] ≠ oldout then
                change = true;
        end
    end
    /* elide redundant checks */
    for B_i ∈ B − B_1 do begin
        for c ∈ gen[B_i] do begin
            if c ∈ in[B_i] then
                elide check c;
        end
    end
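
The listing above can be rendered in C as a rough sketch, assuming each check expression is given a bit position in a 32-bit mask so that the set operations become bitwise operations. The Block struct, the MAX_PREDS limit, the elided field, the function name and the choice of block 0 as the entry block are assumptions of this example, not part of the described embodiment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PREDS 4   /* illustrative limit on predecessors per block */

typedef struct {
    uint32_t gen, kill;      /* inputs: bit i set means check expression i  */
    int      npreds;         /* input: number of predecessors               */
    int      preds[MAX_PREDS];
    uint32_t in, out;        /* computed: checks available on entry / exit  */
    uint32_t elided;         /* computed: checks in gen that are redundant  */
} Block;

/* b[0] is the entry block; n is the number of blocks. */
void elide_redundant_checks(Block *b, int n)
{
    assert(b != NULL && n > 0);

    /* Phase 1: seed the out sets. U is the union of all gen sets. */
    uint32_t universe = 0;
    for (int i = 0; i < n; i++)
        universe |= b[i].gen;
    b[0].in  = 0;
    b[0].out = b[0].gen;
    for (int i = 1; i < n; i++)
        b[i].out = universe & ~b[i].kill;

    /* Phase 2: iterate to a fixed point; confluence is intersection. */
    int change = 1;
    while (change) {
        change = 0;
        for (int i = 1; i < n; i++) {
            /* A block with no predecessors is given an empty in set
               (a conservative choice made for this sketch). */
            uint32_t in = b[i].npreds ? universe : 0;
            for (int k = 0; k < b[i].npreds; k++)
                in &= b[b[i].preds[k]].out;
            uint32_t out = b[i].gen | (in & ~b[i].kill);
            b[i].in = in;
            if (out != b[i].out) {
                b[i].out = out;
                change = 1;
            }
        }
    }

    /* Phase 3: a generated check already available on entry is redundant.
       The entry block's in set is empty, so nothing is elided there. */
    for (int i = 0; i < n; i++)
        b[i].elided = b[i].gen & b[i].in;
}
```

For a diamond-shaped graph in which the entry block performs a check and neither branch kills it, the join block's copy of that check ends up in elided; if either branch kills it, the intersection at the join removes it from the in set and the check is kept.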



An embodiment of the optimization algorithm is shown in 
the above flow graph. The algorithm is run twice, once for 
optimization of spatial checks and again for temporal 
checks. The algorithm executes in three phases. 
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In the fast phase, the algorithm seeds the data-flow -continued 

analysis by approximating all out sets. For all blocks except — — — — — — — — — — ^— — — - — — 

the entry block, the value of OUt[Bj is set to all check unsigned short capability*/* capability is always unique */ 

expressions less those killed by the block B f , i.e., U-kill[B J. ^^{^ 

For the program entry block, B„ this embodiment must 5 values null; 

assume that no checks are available, hence, in[B J is set to base » NULL; 

empty and out[B J is set to the checks generated in the entry sizc ~ 

DlOCK 0 ( . capability = NEVER; 

In the second phase, the data-flow framework is solved to } 

determine where check expressions reach in the program 10 /* dereference */ 

For a check expression to reach a node B„ it must be 1 ^,? m ^? oW ?_ { rv,K 

available at B, for all executions, that is,it must be available (( ^^ iva^bii^capabiuty)) 

in the out sets of all predecessors to block B . This require- FiagTemporaiEiiot( ); 

ment is precisely why the confluence operator is intersec- if (({uns^gned)vahie - (unsigned)base) > 

tion. After the data-flow computation converges on a is „ ^ 

solution, i.e. , change=£alse, the set in[BJ contains all checks * v ^^ ia ^ ror< * 

that reach block B,. } 

In the third phase, the in sets arc used to elide redundant /* pointer additioa */ 

checks. Checks may be elided wherever a lexically identical sp<l^^opcratorKint addend) { 

(or equivalent, if value numbering {Rosen:88} or equality 20 . J^J^^ " °° ^ *' 

tests { Alpem:8 8 } are applied) check is available in the block return p; 

(Le., the same check is in the in set of the block). } 

The denning feature for each analysis (spatial and C ^^J^^^f^ *' 

temporal) is the specification of what constitutes a kfll. A m °^^2^ le ; { 

spatial check is killed by any assignment to a check operand, 25 } 

which includes assignment to the pointer variable or any of } 

the operands of the index expression (if the pointer was — — — — ^ — _ _ 

indexed in the check expression). A temporal check is killed ^ ^ embodiment tested, all explicit storage allocation, 

by any free of the referent storage. If the referent of a free i. e ., calls to mallocO and freeO, called wrapper functions 

can be determined to be different than the check referent 30 which create safe pointers from the standard library routines. 

(e.g. f through alias analysis), the free need not kill the check. The safe mallocO implementation clears all allocated 

While performing these analyses, the algorithm must also storage, so any contained pointers start in the invalid state, 

be wary of kills that may occur through function calls or If a local object in a function is used as a pointer referent the 

aliases. In either case, a conservative approximation must be function was rewritten to allocate a capability for the frame 

made if insufficient information is available and assume that
a kill does occur.

Experimental Evaluation

The invention's safe programming methodology was
evaluated by implementing a semi-automatic source-to-source
translator and examining the run-time, code and data
size overheads for six non-trivial programs. For each
program, performance was analyzed both without optimization
and with run-time resolved optimizations. Lower
bound statistics for the efficacy of compile-time optimization
were also generated through the use of a trace analyzer.

Experimental Framework

FIG. 17 shows the experimental framework. C programs
are translated to their safe counterparts by first rewriting all
pointer and array declarations, calls to malloc() and free(),
and references (use of the '&' operator) to use the Safe-C
macros. These macros, when passed through the C preprocessor
("CPP"), produce either the original C program or a
Safe-C program. A Safe-C program has all pointer and array
declarations changed to type parameterized C++ class
declarations. Using operator overloading in the C++ class
definition, the extended safe pointer and array semantics are
implemented as described above.

The following code shows a portion of an unoptimized
safe pointer implementation:

template <class Type>
class sp {                  /* safe pointer representation */
  Type *value;              /* native pointer */
  Type *base;               /* base address of object */
  unsigned long size;       /* size of object in bytes */
  char storageClass;        /* type of allocation */

in the manner described above. Any pointer in the stack
frame of the function was initialized to an invalid state in the
constructor of the C++ safe pointer class and assigned object
attributes generated from the decomposed access path.

In evaluating performance, a lower bound on the number
of checks required was computed for compile-time optimization
by modifying the safe pointer implementation to
make superfluous stores to a global scratch pad array during
dereferences. The location of the superfluous stores indicated
whether or not a particular program point required a
check at run-time. The run-time-resolved optimization
scheme described above was used to determine if a check
was required at run-time. The superfluous stores were
tracked by an address trace analyzer which tabulated, by
address, how many checks were executed. When the program
terminated, the total number of program points that did
not execute any checks were computed, as were the total
number of dynamic checks elided at these program points.
These results form a lower bound on the number of static
(in the code) and dynamic (executed at run-time) checks
required for a compile-time optimized program. With true
compile-time analysis, the actual number of checks required
may be higher because 1) other inputs may require checks at
program points that did not execute any checks, and 2)
limitations in static analysis, e.g., imprecisions due to program
aliases, may force a compile-time optimizer to make
conservative assumptions and add checks where they may
not be needed. To increase the effectiveness of the lower
bound study, the results of four separate inputs were combined.

The lower bound results are not a strict lower bound.
Other static analysis techniques, e.g., range analysis
{Harrison:77} or program restructuring, could decrease the number
of static checks required. However, for the proposed
compile-time optimization framework without program
restructuring, the lower bound results are a strict lower
bound.
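
The lower-bound tabulation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: a per-site counter array stands in for the scratch pad stores and the address trace analyzer, and the names kNumSites, record_check and summarize are hypothetical and do not appear in the specification.

```cpp
#include <array>
#include <cstddef>

// Hypothetical number of instrumented dereference sites (program points).
constexpr std::size_t kNumSites = 4;

struct SiteStats {
    unsigned long executed = 0;  // dereferences at this site that needed a check
    unsigned long elided = 0;    // dereferences at this site whose check was skipped
};

// Stand-in for the scratch pad plus trace analyzer: one entry per site.
std::array<SiteStats, kNumSites> g_sites{};

// Called at each dereference; `required` is the run-time-resolved decision.
void record_check(std::size_t site, bool required) {
    if (required) {
        ++g_sites[site].executed;
    } else {
        ++g_sites[site].elided;
    }
}

struct LowerBound {
    std::size_t sites_without_checks = 0;     // static checks that were never needed
    unsigned long dynamic_checks_elided = 0;  // dynamic checks elided at those sites
};

// Computed once the program terminates.
LowerBound summarize() {
    LowerBound lb;
    for (const SiteStats& s : g_sites) {
        if (s.executed == 0) {
            ++lb.sites_without_checks;
            lb.dynamic_checks_elided += s.elided;
        }
    }
    return lb;
}
```

Combining several inputs, as in the study, would correspond to accumulating into the same counters across runs before calling summarize().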

Analyzed Programs

Six programs, selected because each exhibits a high
frequency of indirect references, were analyzed. Table 2
below details the programs that were analyzed. For each, the
frequency of dereferences in the program text (Insts per
Dereference/Static), and the dynamic frequency of dereferences
executed (Insts per Dereference/Dynamic) are shown.

The Class column classifies each program according to its
spatial and temporal complexity. The spatial complexity, S,
indicates the frequency of pointer arithmetic or indexing:
either high (S+), medium (S), or low (S-). The temporal
complexity, T, is an indicator of how often the program frees
storage. If this factor is high (T+), the program frees storage
throughout execution; if low (T-), the program never frees
storage (or only at program completion).



TABLE 2

               Instructions/
               Dereference
Program Name   Static   Dynamic   Class   Description

Anagram        1063     7.6       S+,T-   anagram generator
Backprop       148.5    8.9       S+,T-   neural net trainer
GNU BC         15.5     7.6       S,T+    arbitrary precision calculator
Min-Span       48.7     5.9       S-,T+   min spanning tree computation
Partition      62.4     3.7       S,T-    graph partitioning tool
YACR-2         37.1     14.0      S+,T-   channel router

All programs were compiled and executed on a DECstation
3100 using AT&T USL cfront version 3.0.1. The output
of cfront (C code) was compiled using MIPS cc version 2.1
at optimization level '-O2'. All instruction counts were
obtained with QPT {QPT:93}.

For all analyses, object attributes were only attached to
pointer values. A 15-byte safe pointer 310 (275% overhead)
was used in the unoptimized case: a 4-byte pointer value, a
4-byte base, a 4-byte size, a 1-byte storage class specifier, and
a 2-byte capability. For run-time resolved optimization, a
1-byte dirty flag, a 4-byte last index, and a 2-byte free
counter were added for a total size of 22 bytes (450%
overhead). Due to a bug in the C++ compiler, sizeof() in the
safe-pointer implementation could not be used if the referent
referred to itself; as a result, bc, minspan, and partition all
required the size of the referent to be stored in the safe
pointer 310, which added a 4-byte overhead for these
programs. There were no space overheads for array
variables, as all required object attributes are known at
compile-time. Only the actual program code was rewritten;
all system library routines remained unchecked. However,
interface checking was performed. Whenever a system
library routine is called, any pointer arguments are validated
against the time and space bounds expected by the library routine.
For example, if a call were made to fread(), the interface
check would ensure that the destination of the read was live
storage and that the entire length of the read operation would
fit into the referent.
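
The fread() interface check mentioned above might look roughly like the sketch below. It is an assumption-laden illustration: SafeBytePtr, interface_ok and checked_fread are hypothetical names, the boolean live field stands in for the capability-based temporal check, and returning 0 stands in for flagging an error.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical, simplified safe pointer: only the fields this check needs.
struct SafeBytePtr {
    char*       value;  // current pointer value
    char*       base;   // base address of the referent
    std::size_t size;   // size of the referent in bytes
    bool        live;   // stand-in for the temporal (capability) check
};

// True if [value, value + bytes) is live storage inside the referent.
bool interface_ok(const SafeBytePtr& p, std::size_t bytes) {
    if (!p.live || p.value < p.base) return false;
    std::size_t offset = static_cast<std::size_t>(p.value - p.base);
    return offset <= p.size && bytes <= p.size - offset;
}

// Checked wrapper: validates the destination, then calls the unchecked
// library routine. Returns 0 without reading if the check fails.
std::size_t checked_fread(SafeBytePtr dst, std::size_t elem, std::size_t n,
                          std::FILE* f) {
    // Reject requests where elem * n would overflow std::size_t.
    if (elem != 0 && n > static_cast<std::size_t>(-1) / elem) return 0;
    if (!interface_ok(dst, elem * n)) return 0;
    return std::fread(dst.value, elem, n, f);
}
```

The library routine itself stays unchecked; only the arguments crossing the interface are validated, which matches the division of labor described in the text.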

Results of Execution overhead measurements. 

For the run-time-optimized executions, the normalized
instruction counts range from 2.3 (yacr) to 6.4 (bc). This
overhead reflects program performance without any
compile-time optimization. While this performance degradation
is probably acceptable for the development cycle of
short or medium length program executions, it may still be
prohibitively expensive for very long running programs, and
it is certainly too costly a price to pay for in-field
instrumentation of a program. Examining more closely the
breakdown of the execution overheads yields much insight into
how the performance of the checking methodology could be
improved.

For each program, the overhead costs were broken down 
into five categories. 

For bc, minspan, and partition, run-time check optimization
paid off with a slightly lower execution cost for spatial
checking. For anagram, backprop, and yacr, adding run-time
checks resulted in a higher cost for spatial access checking,
and in the case of backprop, a higher overall execution
overhead.

These programs demonstrate the trade-offs involved in 
providing run-time check optimization. Run-time optimiza- 
tion adds the extra overhead of copying, maintaining, and 
checking the extra safe-pointer state. If this added overhead, 
plus the overhead of the required checks, is greater than 
doing all the checks, there is no advantage to run-time check 
optimization. With faster checks, compile-time 
optimization, and spatially complex programs, this trade-off 
becomes even more acute. 
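
The trade-off stated above can be written as a simple cost comparison. The model below is a hedged sketch, not a measurement from the study: the structure CheckProfile, its fields, and the assumption that the extra state costs a fixed amount per dereference are all illustrative.

```cpp
// All names and cost units here are illustrative assumptions.
struct CheckProfile {
    double derefs;         // dynamic dereferences executed
    double required_frac;  // fraction of dereferences that still need a check
    double check_cost;     // cost of one full check
    double state_cost;     // per-dereference cost of maintaining and testing
                           // the extra safe-pointer state
};

// Cost of simply executing every check.
double cost_all_checks(const CheckProfile& p) {
    return p.derefs * p.check_cost;
}

// Cost with run-time check optimization: extra state plus the checks
// that could not be elided.
double cost_optimized(const CheckProfile& p) {
    return p.derefs * p.state_cost +
           p.derefs * p.required_frac * p.check_cost;
}

bool optimization_pays_off(const CheckProfile& p) {
    return cost_optimized(p) < cost_all_checks(p);
}
```

Under this model, a cheap check that is almost always required (the spatial case in spatially complex programs) loses, while an expensive check that is rarely required (the temporal case) wins.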

Since anagram, backprop, and yacr must execute many of 
their checks, they do not benefit from the run-time optimi- 
zations. For yacr, the effects are much less pronounced 
because dereferences are much less frequent (as shown in 
Table 2, above). Compile-time analysis will, therefore, be 
ineffective for most pointer and array intensive programs, as
they are either spatially complex or rely heavily on dynamic 
storage, two properties which reduce the effectiveness of 
compile-time spatial check optimization. 

The second effect to observe when comparing the opti- 
mized to unoptimized execution costs is that the greatest 
benefit of run-time check optimization always comes from 
eliding temporal checks. In fact, adding run-time optimiza- 
tion for temporal checks caused a significant decrease in all 
execution overheads except backprop. There are two aspects
to this result. First, temporal checks are very expensive
(requiring an associative search), so eliding one has a great
performance advantage. Second, the run-time check optimization
of temporal checks is very effective. Temporal checks
are rarely required, even for bc and minspan, both of which
free storage often. In the case of backprop, adding run-time
optimization for temporal checks resulted in an increased
execution overhead: backprop has only one dynamic object,
an array, so temporal checking is relatively cheap without 
any optimization (the capability is always at the head of the 
hash bucket chain). In this case, the cost of maintaining the 
extra storage required for the free counter outweighs the cost 
of executing all temporal checks. 
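
A minimal sketch of eliding temporal checks with a free counter follows. It assumes a global set of live capabilities in place of the hash table, and the names g_free_count, TemporalAttr, destroy_capability and temporal_ok are hypothetical; counter wrap-around is ignored.

```cpp
#include <unordered_set>

// Hypothetical global state: live capabilities and a count of deallocations.
std::unordered_set<unsigned> g_live_caps;
unsigned g_free_count = 0;
unsigned long g_searches = 0;  // full (associative) temporal checks performed

struct TemporalAttr {
    unsigned capability;       // capability assigned at allocation
    unsigned seen_free_count;  // g_free_count at the last successful check
    bool     checked;          // has a full check succeeded at least once?
};

void destroy_capability(unsigned cap) {
    g_live_caps.erase(cap);
    ++g_free_count;  // any deallocation may have invalidated any pointer
}

// Returns true if the referent is still live. The associative search is
// skipped when no deallocation has happened since the last successful check.
bool temporal_ok(TemporalAttr& a) {
    if (a.checked && a.seen_free_count == g_free_count) {
        return true;  // check elided
    }
    ++g_searches;
    if (g_live_caps.count(a.capability) == 0) {
        return false;  // temporal error
    }
    a.checked = true;
    a.seen_free_count = g_free_count;
    return true;
}
```

The extra seen_free_count field is the state whose maintenance cost, for a program such as backprop with a single dynamic object, can exceed the cost of simply running every temporal check.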

The lower bound analyses also suggest that few program 
points require temporal checks; after inspecting the code, it 
is apparent that compile-time analysis (even when not 
inter-procedural) for eliding temporal checks will be very 
effective on these programs. Few of the dominating loops
and procedures contain procedure calls or calls to free().

Adding checking code reduces the effectiveness of many 
traditional compiler optimizations. All check code is placed 
in-line except for calls to ValidCapability() and abort().
These functions are both externally defined, so the compiler 
must make conservative assumptions as to what actions they 
take. This conservative approximation has the effect of 
limiting the effectiveness of many optimizations such as 
invariant code motion, register allocation, copy propagation, 
and common subexpression elimination. Neither the 
ValidCapability() nor the abort() function produces any
side-effects for normal executions. Hence, better compiler
integration, i.e., providing a special channel of communication
between the safe program generator and the compiler
optimizer, would certainly increase the performance of the
safe executions. It should be noted that many compilers, e.g.,
GNU "gcc", already understand the special semantics of
abort() and use inter-procedural information related to this
command to improve optimizations. One should be able to
achieve the same results for ValidCapability().
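
One way such information could be passed to a present-day optimizer is through function attributes. The sketch below is an assumption, not part of the described system: it uses the GCC/Clang `pure` attribute and the standard C++11 `[[noreturn]]` attribute, and the table g_live and the functions FlagTemporalError and checked_sum are hypothetical. In the described system ValidCapability() is externally defined; it is defined inline here only so the example is complete.

```cpp
#include <cstdlib>

// Hypothetical capability table: g_live[c] is true while capability c is valid.
static bool g_live[8] = {false, true, true, false, false, false, false, false};

// GCC/Clang attribute: the function has no side effects and its result depends
// only on its arguments and the memory it reads, so repeated calls with no
// intervening stores may be merged or hoisted out of loops.
__attribute__((pure)) bool ValidCapability(unsigned capability) {
    return capability < 8 && g_live[capability];
}

// Standard attribute: tells the optimizer that this call never returns.
[[noreturn]] void FlagTemporalError() { std::abort(); }

// A loop with a temporal check on every dereference of data[i].
int checked_sum(const int* data, int n, unsigned capability) {
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        if (!ValidCapability(capability)) FlagTemporalError();
        sum += data[i];
    }
    return sum;
}
```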

Text size overheads were measured. All checking code, 
except the capability routines and what the C++ compiler 
extracts for expression simplification, is placed in-line into 
the original program text. Surprisingly, the text overheads 
were quite small; 35% to 300% for the unoptimized 
executables and 41% to 340% for the run-time optimized 
programs. The text sizes for the run-time optimized pro- 
grams were larger due to additional code required for 
maintaining, copying, and checking the extra object 
attributes. There is a strong correlation between static deref- 
erence density and the resulting text overhead. 

The data size overheads were measured as the total size of 
initialized (.data) and uninitialized (.bss) data segments plus 
the size of the heap segment when the program terminates 
execution. The data size overheads on the stack were not 
measured. All programs, except minspan, have data size 
overhead below 100%. Backprop has the lowest overhead
(less than 5%) because most of its storage is large global
arrays which do not require any object attributes. Minspan
has the highest overhead (330%), which stems from the high
density of pointers in its heap allocations, most of which 
contain eight pointers and three integers. Some of the 
run-time optimized programs have slightly larger overheads 
due to the additional object attributes. 

To summarize the main points of the measurement results: 

Execution overheads, even without compile-time
optimization, are low enough to make the methodology
useful during program development. However, the
overheads are not likely low enough that programmers
would release software with checking enabled.

The largest contributing factors to execution overhead are 

1) safe-pointer structures are not register allocated, and 

2) many traditional optimizations fail with the addition 
of checks. Other performance losses are attributed to 
the C++ compiler simplifying expressions through the 
use of static functions, and, due to a bug in the C++ 
compiler, the need to include the type size of the 
referent in the object attributes. None of these difficul- 
ties are without recourse, however. Better integration 
between the safe compiler and the optimizer could fix 
most problems. 

Dynamically eliding spatial checks is generally 
ineffective, primarily because maintaining the extra 
state, and checking it, quickly outweighs the cost of 
executing all checks. The spatial check is very cheap to 
execute, and spatially complex programs tend to 
execute most of the checks anyway. 

Temporal checks, on the other hand, are very expensive to
perform and are rarely required, so run-time optimization
proves beneficial in most cases.

The text and data size overheads are generally quite low.
The text overheads for all programs with run-time
optimization range from 41% to 340%, with all but
two below 100%. Data overheads range from 5% to
330%, with all but one below 100%. Run-time optimized
executions have slightly larger text and data
sizes.
Two examples of the operation of the safe programming
technique will be described next. FIG. 13 shows a spatial
access error, and FIG. 14 illustrates a temporal access error.
Safe pointer values are specified as a 5-tuple with the
following format: [{value}, {base}, {size}, {storageClass},
{capability}]; x indicates a don't care value. In the example
shown in FIG. 13, a spatial access error is flagged when the
program dereferences a safe pointer whose value is less than
the base of the referent. In the example shown in FIG. 14, a stale
pointer, q, is dereferenced. Even though the same storage has
been reallocated to p, the capability originally assigned to q
has been destroyed during the call to free(); thus, the
temporal access error is detected.
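
The two situations just described can be modelled with a small sketch. It is not a reproduction of FIG. 13 or FIG. 14: addresses are plain integers so that no real memory is touched, and SafePtr, safe_malloc, safe_free and check are hypothetical names chosen for this illustration.

```cpp
#include <cstddef>
#include <unordered_set>

enum class Access { Ok, SpatialError, TemporalError };

// Hypothetical model of the 5-tuple [value, base, size, storageClass, capability].
struct SafePtr {
    std::size_t value;         // current pointer value
    std::size_t base;          // base address of the referent
    std::size_t size;          // size of the referent in bytes
    char        storageClass;  // 'H' marks heap storage in this sketch
    unsigned    capability;    // temporal attribute
};

std::unordered_set<unsigned> g_capabilities;  // capabilities of live objects
unsigned g_next_capability = 1;

// Models an allocation at a caller-chosen address with a fresh capability.
SafePtr safe_malloc(std::size_t address, std::size_t size) {
    unsigned cap = g_next_capability++;
    g_capabilities.insert(cap);
    return SafePtr{address, address, size, 'H', cap};
}

// Destroys the capability; every copy of the pointer becomes stale.
void safe_free(const SafePtr& p) { g_capabilities.erase(p.capability); }

// Check performed on a dereference that accesses `width` bytes at p.value.
Access check(const SafePtr& p, std::size_t width) {
    if (g_capabilities.count(p.capability) == 0) return Access::TemporalError;
    if (p.value < p.base) return Access::SpatialError;
    std::size_t offset = p.value - p.base;
    if (offset > p.size || width > p.size - offset) return Access::SpatialError;
    return Access::Ok;
}
```

A pointer moved below its base fails the spatial test, and a copy of a freed pointer fails the temporal test even when a later allocation reuses the same address, because the new allocation receives a different capability.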
The safe programming technique described above is
significantly more reliable than the reference-chaining
technique, because its correctness does not rely on tracking
pointer values. In addition, it is not easy for the programmer
to subvert the checking mechanism through, e.g., recasting
and type-less calls to free() (the memory-deallocation
function), since recast pointers carry a copy of the original
pointer's attributes which will still be checked when used to
access the object. Furthermore, detection of storage-leak errors is
enhanced, even in the presence of circular references (e.g.,
where a chain of pointers-to-pointers eventually points back
to an earlier pointer in the chain).

By using a conservative collector in conjunction with the
safe programming technique described above, one can make
the process of detecting storage-leak errors intrinsically
more reliable (through the elimination of false leaks and by
reducing the possibility of false hits). False indications of
storage-leak errors (called "false leaks") cannot occur under
the safe programming technique described above. The base
field always holds a pointer to the head of the allocation, and
the program cannot manipulate this value. The problem of
"false hits" (when non-pointer values appear to be pointers
which reference areas in heap storage, thus missing actual
storage-leak errors) can also be addressed, by checking for
safe-pointer invariant information in any memory item
which may be a pointer referencing areas in heap storage.
One advantageous test is to ensure that both the capability
and the free-counter values of the possible reference are
valid. If an incrementing counter is used for each, each value
should be less than the next unused counter value.
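
The counter test described above might be sketched as follows, assuming incrementing counters for both values. The structures Candidate and Counters and the function plausible_safe_pointer are hypothetical names used only for this illustration.

```cpp
// Hypothetical attributes read from a memory item that a conservative
// scan suspects of being a safe pointer into heap storage.
struct Candidate {
    unsigned capability;    // value found in the capability field
    unsigned free_counter;  // value found in the free-counter field
};

// Hypothetical snapshot of the allocator's incrementing counters.
struct Counters {
    unsigned next_capability;    // next unused capability value
    unsigned next_free_counter;  // next unused free-counter value
};

// A genuine safe pointer can only hold values that have already been
// issued, so each field must be less than the next unused value.
bool plausible_safe_pointer(const Candidate& c, const Counters& k) {
    if (c.capability >= k.next_capability) return false;
    if (c.free_counter >= k.next_free_counter) return false;
    return true;
}
```

A memory item that fails this test is treated as a non-pointer, which reduces the chance that a random bit pattern hides a real leak.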

As noted in the background section, techniques which
operate on object code only (such as Purify) cannot detect all
memory access errors, because it is too difficult to determine
the intended referent for each attempted access. The safe
programming technique described above, on the other hand,
can detect all memory access errors because it tracks not
only the state of storage, but also the intended referents of
all pointer values.

Although the safe programming technique described
above can be implemented in many different programming
languages, it is not portable across languages; that is, each
given implementation must be tailored for a specific
language. The safe programming technique described is quite
portable across different platforms, however, especially if
implemented as a source-to-source translator. Further, the
technique is not limited by the language or the
expressiveness of the language; that is, it can be applied successfully
to compiled or interpreted languages with subscripted and
mutable pointers, local references, unions, and explicit and
type-less dynamic storage management.

The safe programming technique described above finds
both spatial and temporal access errors. It is amenable to the
use of run-time and compile-time optimizations through
which access checks can be omitted ("elided"). Further,
since the technique uses compile-time instrumentation,
resource requirements are significantly lower than those
required, for example, by the technique used in CodeCenter.
Compile-time instrumentation also allows the safe programming
technique described in the invention to employ
compile-time optimizations.

It is to be understood that the above description is 
intended to be illustrative, and not restrictive. Many other 
embodiments will be apparent to those of skill in the art 
upon reviewing the above description. For instance, a 
method suited for a compilation process using the invention 
is described above, but a person skilled in the art could use 
an analogous method in an interpretation or translation 
process by using the invention to achieve a safe program in 
those environments. The scope of the invention should, 
therefore, be determined with reference to the appended
claims, along with the full scope of equivalents to which 
such claims are entitled. 

A listing of the references cited within is attached as an
Appendix.
What is claimed is:

1. A method for detecting memory access errors which 
occur while executing a computer program, wherein the 
computer program includes a pointer to a data object, said 
method comprising the steps of: 

providing a spatial attribute for said data object, said
spatial attribute defining an address space within which
valid accesses may be made to said data object;

associating said spatial attribute with said pointer;

providing a temporal attribute for said data object, said
temporal attribute defining a period of time within
which valid accesses may be made to said data object,
wherein the step of providing said temporal attribute
comprises the steps of:

providing a temporal validity number having a temporal
validity number location; and
providing a temporal capability number;

associating said temporal capability number with said
pointer;

providing a dereference to said pointer;

determining if said dereference falls outside said address
space;

if said dereference falls outside said address space,
flagging a spatial error;

determining from said temporal capability number and
said temporal validity number whether a temporal error
has occurred; and

if a temporal error has occurred, flagging said temporal
error.

2. The method according to claim 1, wherein the step of 
providing a spatial attribute comprises the step of providing 
a size attribute indicative of said address space. 
3. The method according to claim 2, wherein the step of
providing a spatial attribute further comprises the step of
providing an address attribute indicative of a base address.

4. The method according to claim 1, wherein said address
space has a lower bound and an upper bound and wherein
the step of providing a spatial attribute comprises the steps
of:

providing a low-end address attribute indicative of said
lower bound; and

providing a high-end address attribute indicative of said
higher bound.

5. The method according to claim 1, wherein the step of
providing a temporal attribute further comprises the step of
providing a storage class attribute.

6. The method according to claim 5, wherein the step of
providing a storage class attribute comprises the step of
specifying storage classes which distinguish heap and local
data objects.

7. The method according to claim 6, wherein the step of
providing a spatial attribute comprises the step of providing
a size attribute indicative of the address space.

8. The method according to claim 7, wherein the step of
providing a spatial attribute further comprises the step of
providing an address attribute indicative of a base address.

9. The method according to claim 6, wherein said address
space has a lower bound and an upper bound and wherein
the step of providing a spatial attribute comprises the steps
of:

providing a low-end address attribute indicative of said
lower bound; and

providing a high-end address attribute indicative of said
higher bound.

10. A method of detecting a memory access error encountered
when executing program code including a pointer
pointing to a variable, the method comprising the steps of:

assigning temporal attributes defining temporal status for
said pointer, wherein the step of assigning temporal
attributes comprises the steps of:

providing a temporal validity number associated with
said variable;
providing a temporal capability number; and
associating said temporal capability number with said
pointer;

assigning spatial attributes defining an address space valid
for said pointer;

executing a memory access check on a memory access,
wherein the step of executing a memory access check
comprises the steps of:

determining from said temporal capability number and
said temporal validity number whether a temporal
error has occurred; and
determining whether said memory access is to an
address within said valid address space;

if a temporal status has occurred, flagging said temporal
error; and

if said memory access is to an address outside said valid
address space, flagging a spatial error.

11. The method according to claim 10 wherein the step of
determining whether a temporal error has occurred comprises
the step of accessing said temporal validity number
through a hash table.

12. The method according to claim 10 wherein the temporal
validity number has a temporal validity number location;

wherein the step of assigning temporal attributes further
comprises the steps of:

providing a backpointer specifying said temporal validity
number location; and
associating said backpointer with said pointer; and

wherein the step of determining whether a temporal error
has occurred comprises the step of accessing said
temporal validity number via said backpointer.

13. A method for detecting memory access errors which
occur while executing a computer program, wherein the
computer program includes a pointer to a data object, the
method comprising the steps of:

providing a counter having a count for tracking memory
deallocations;

providing a spatial attribute for said data object, said
spatial attribute defining an address space within which
valid accesses may be made to said data object;

associating said spatial attribute with said pointer;

associating said temporal attribute with said pointer;

providing a temporal attribute for said data object,
wherein the step of providing said temporal attribute
comprises the steps of:

providing a temporal validity number having a temporal
validity number location; and
providing a temporal capability number;

associating said temporal capability number with said
pointer;

providing a first dereference and a second dereference to
said pointer wherein said first dereference occurs prior
to said second dereference;

determining if said second dereference falls outside said
address space;

if said second dereference falls outside said address space,
flagging a spatial error;

determining if said count has changed since said first
dereference; and

if said count has changed since said first dereference,
verifying said temporal attributes, wherein the step of
verifying said temporal attributes comprises the steps
of:

determining from said temporal capability number and
said temporal validity number whether a temporal
error has occurred; and

if a temporal error has occurred, flagging said temporal
error.

14. The method according to claim 13 wherein the step of 
providing a temporal attribute further comprises the step of 
providing a storage class attribute which distinguishes heap 
and local data objects. 

15. A system for detecting a memory access error in a 
computer program executing on a computer, the system 
comprising: 

means for assigning object attributes to a pointer, wherein 

said object attributes comprise: 

a temporal capability number, and 

a spatial attribute defining an address space valid for 
said pointer; 
means for assigning a temporal validity number; 
means for adding memory access check instructions to 

said computer program, wherein said memory access 

check instructions comprise: 

instructions which determine from said temporal capa- 
bility number and said temporal validity number 
whether a temporal error has occurred; and 

instructions which determine whether a memory access 
made within said computer program is to an address 
within said valid address space. 
16. The system according to claim 15 wherein the instructions
for determining whether a temporal error has occurred
comprise an instruction for accessing said temporal validity
number through a hash table.

17. The system according to claim 15 wherein: 

the temporal validity number has a temporal validity
number location;

the temporal attributes further include a backpointer
specifying said temporal validity number location,
wherein said backpointer is associated with said
pointer; and

the instructions for determining whether a temporal error
has occurred comprise an instruction for accessing said
temporal validity number via said backpointer.
18. The system according to claim 15 wherein the instruc-
tions for determining whether a temporal error has occurred 
comprise an instruction for determining whether a temporal 
memory access check can be elided. 

19. The system according to claim 15 wherein the instruc- 
tions for determining whether a spatial error has occurred 
comprise an instruction for determining whether a spatial 
memory access check can be elided. 

20. The system according to claim 15 wherein the means 
for adding memory access check instructions to said com- 
puter program comprise means for determining whether a 
memory access check can be elided. 